2031 OXIDOREDUCTASE



Field of the invention 

The present invention relates to a method of screening for an anti-fungal agent, to
fungal 2031 oxidoreductase (2031 OR) enzymes and to diagnosis and therapy of
fungal infections.

Background of the invention 

Oxidoreductases are a major class of enzymes (EC 1) that catalyse oxidation-reduction 
(redox) reactions. Redox reactions involve the transfer of reducing equivalents, in the 
form of electrons or hydrogen atoms, between molecules, i.e., from an electron donor 
(or reductant) to an electron acceptor (or oxidant). There are many different types of 
oxidoreductase important for many cellular processes from respiration to protein 
folding. 

The NADH:flavin oxidoreductase/NADH oxidase family of enzymes (InterPro
reference IPR001155) contains approximately 263 members, mostly of bacterial or
yeast origin but with some plant and nematode members. Members of this family use
flavin mononucleotide (FMN) or flavin adenine dinucleotide (FAD) as a tightly bound
prosthetic group. The flavin prosthetic group can exist in an oxidised (FMN or FAD)
or a reduced form (FMNH2 or FADH2). These oxidoreductases use the reduced form
of nicotinamide adenine dinucleotide (NADH) or nicotinamide adenine dinucleotide
phosphate (NADPH) as the reductant. A variety of substrates can act as oxidants in the
redox reaction.

Old Yellow Enzyme (OYE) is the oldest known member of this family of
oxidoreductases (reviewed in Williams and Bruce, 2002, Microbiology 148, 1607-1614).
OYE1 (EC 1.6.99.1) was isolated from brewer's bottom yeast by Warburg &
Christian (1932, Naturwissenschaften 20, 688) and was the first enzyme for which a
cofactor was shown to be required (Theorell, 1935, Biochem. Z. 275, 344-346). This
yellow cofactor was found to be riboflavin 5'-phosphate (also known as flavin
mononucleotide, FMN). There are 2 OYEs known in Saccharomyces cerevisiae
(OYE2 & OYE3) and 2 in Schizosaccharomyces pombe. A great deal is known about
the biochemical mechanism and structure of the enzyme; however, the precise
physiological role of the enzyme remains to be elucidated.

OYE has NADPH dehydrogenase activity (see reaction 1 below). The reduced
enzyme catalyses the reduction of α/β-unsaturated carbonyl compounds including
cyclohexenone (see reaction 2), duroquinone, menadione and N-ethylmaleimide.

(1) Enz-FMN + 2NADPH → Enz-FMNH2 + 2NADP+

(2) Enz-FMNH2 + 2-cyclohexenone → Enz-FMN + cyclohexanone

It has been speculated that OYE may be involved in sterol metabolism (Stott et
al., 1993, J. Biol. Chem. 268: 6097-6106) or may be part of the antioxidant defence
machinery involved in detoxification of, for example, lipid peroxidation breakdown
products (Kohli & Massey, 1998, J. Biol. Chem. 273, 32763-32770). Neither OYE2
nor OYE3 is essential for S. cerevisiae
(http://genome-www4.stanford.edu/cgi-bin/SGD/locus.pl?locus=S0001222;
http://db.yeastgenome.org/cgi-bin/SGD/locus.pl?locus=YPL171C).

Bacterial members of the NADH:flavin oxidoreductase family include
Escherichia coli N-ethylmaleimide reductase, Pseudomonas putida M10 morphinone
reductase, Enterobacter cloacae PB2 pentaerythritol tetranitrate reductase and
Azoarcus evansii 2-aminobenzoyl-CoA monooxygenase/reductase (Schühle et al.,
2001, J. Bacteriol. 183, 5268-5278).

Summary of the invention

The inventors have found a gene for an oxidoreductase of the NADH:flavin
oxidoreductase type to be essential for the viability of fungal cells. This finding allows
the identification of anti-fungal agents based on their ability to target the
oxidoreductase.

The invention provides a new group of oxidoreductases which are herein referred
to as 2031 oxidoreductases (2031 ORs) which can be used to screen for anti-fungal
agents. In particular 2031 oxidoreductases from Aspergillus fumigatus, Aspergillus
nidulans, Candida albicans, Colletotrichum trifolii, Fusarium graminearum
(anamorph Gibberella zeae), Fusarium sporotrichoides, Magnaporthe grisea,
Neurospora crassa, Schizosaccharomyces pombe and Ustilago maydis (see Table I)
are provided. 2031 OR defines a novel set of oxidoreductases, related to but distinct
from OYE and its close relatives, which are essential for the viability of fungal cells.

Accordingly the invention provides the following: 

- a method of identifying an anti-fungal agent which targets an essential protein or
gene of a fungus comprising contacting a candidate substance with

(i) a NADH:flavin oxidoreductase protein which comprises the sequence shown
by SEQ ID NO: 3,

(ii) a NADH:flavin oxidoreductase protein which is a homologue of (i) and
which comprises the sequence shown by SEQ ID NO: 8, 12, 14, 19, 24, 42, 44, 83 or
85,

(iii) a protein which has 50% identity with (i) or (ii),

(iv) a protein comprising a fragment of (i), (ii) or (iii) which fragment has a
length of at least 50 amino acids,

(v) a polynucleotide that comprises sequence which encodes (i), (ii), (iii) or (iv),

(vi) a polynucleotide comprising sequence which has at least 70% identity with
the coding sequence of (v),

and determining whether the candidate substance binds or modulates (i), (ii), (iii), (iv),
(v) or (vi), wherein binding or modulation of (i), (ii), (iii), (iv), (v) or (vi) indicates that
the candidate substance is an anti-fungal agent,

- use of (i), (ii), (iii), (iv), (v) or (vi) as defined above to identify or obtain an
anti-fungal agent,

- use of an anti-fungal agent identified by the method of the invention in the
manufacture of a medicament for prevention or treatment of fungal infection,

- a method of detecting the presence of a fungus in a sample comprising detecting the
presence in the said sample of a protein or polynucleotide of the invention,

- an isolated protein or polynucleotide of the invention,

- an organism which is transgenic for a polynucleotide of the invention,

- an organism which has been genetically engineered to render a polynucleotide or
protein of the invention non-functional or inhibited,

- an antibody which is specific for a protein of the invention,

- a method for preventing or treating a fungal infection comprising administering an
anti-fungal agent identified by the screening method of the invention, and

- a fungus which has been killed, or whose growth has been impaired, by inhibition of
the expression or activity of a protein or polynucleotide of the invention.

Detailed description of the invention

As mentioned above the invention relates to use of particular protein and
polynucleotide sequences (termed "proteins of the invention" and "polynucleotides of
the invention" herein) which are of, or derived from, fungal oxidoreductase proteins
and polynucleotides (including homologues and/or fragments of the fungal
oxidoreductase proteins and polynucleotides) to identify anti-fungal agents.

As used herein, the term "oxidoreductase" ("OR") may be defined as an enzyme
which is capable of catalysing an oxidation or reduction reaction. The protein of the
invention may have an oxidation or reduction activity, such as any such activity
mentioned herein. The ORs of the invention generally fall within classification EC 1 of
the Enzyme Commission.

An essential fungal gene may be defined as one which, when disrupted
genetically (for example when not expressed) in a fungus, prevents survival or
significantly retards growth of the cell on minimal or defined medium, or in guinea
pigs, mice, rabbits or rats infected with the fungus. In one embodiment the protein of
the invention is able to complement such an effect of the genetic disruption. Thus the
protein may cause survival (viability) of a fungal cell which does not express its native
2031 oxidoreductase.

A protein or polynucleotide of the invention (or a fungal "2031 OR" gene,
nucleic acid or protein) may be defined by similarity in sequence to another member
of the family. As mentioned above this similarity may be based on percentage identity
(for example to the sequences shown in the sequence listing).

A protein or polynucleotide of the invention may comprise one or more of the
motifs defined by regions 1-11 of Figures 1 and 2 (marked at the top of the Figures)
of any of the sequences shown. Thus a protein of the invention may comprise one or
more of motifs 1-11 as shown for SEQ ID NO: 3 and a polynucleotide of the
invention may comprise one or more of motifs 1-11 as shown for SEQ ID NO: 1.

Typically the motif is present in substantially the same location as the equivalent
location shown in Figure 1 or 2. The equivalent location can be deduced, for example,
using any suitable algorithm mentioned herein. In one embodiment the protein or
polynucleotide also comprises sequence flanking the motif as shown in Figures 1 or 2,
such as sequences of length at least 10, 20 or 30 amino acids/nucleotides flanking the
N terminal side and/or C terminal side, or 5' and/or 3' side, of the motif; or sequence
which has percentage identity with the flanking sequence.

The protein of the invention typically comprises at least 2, 3, 5, 8 or 11 of the
motifs shown in Figures 1 and 2. The protein preferably comprises at least motif no. 6
and/or motif no. 9.

The protein or polynucleotide of the invention may align with other 2031 OR
polynucleotides or proteins (as shown in SEQ ID Nos. 1-44 and 82-85) showing a
greater identity to these than to Old Yellow Enzyme family polynucleotides or proteins.

The protein or polynucleotide of the invention typically clusters with other 2031
OR polynucleotides or proteins (as shown in SEQ ID Nos. 1-44 and 82-85) rather than
Old Yellow Enzyme family polynucleotides or proteins after phylogenetic analysis, for
example with a bootstrap value of greater than 60%.

In one embodiment the protein of the invention has a sequence which matches
PFAM profile "oxidored_FMN", or INTERPRO profile IPR001155 (for example with
an E-value of e-50 or less) and is closer to a 2031 OR shown in any one of SEQ ID
Nos. 1-44 and 82-85 than to Old Yellow Enzyme family proteins.

The protein or polynucleotide of the invention may be in isolated form (such as
non-cellular form), for example when used in the method of the invention. Preferably,
the isolated polynucleotide comprises a 2031 OR gene. Preferably, the isolated protein
comprises a 2031 OR. The polynucleotide may comprise native, synthetic or
recombinant polynucleotide, and the protein may comprise native, synthetic or
recombinant protein. The polynucleotide or protein may comprise combinations of
native, synthetic or recombinant polynucleotide or protein, respectively. The
polynucleotides and proteins of the invention may have a sequence which is the same
as, or different from, naturally occurring 2031 OR polynucleotides and proteins.

It is to be understood that the term "isolated from" may be read as "of" herein.
Therefore references to polynucleotides and proteins being "isolated from" a particular
organism include polynucleotides and proteins which were prepared by means other than
obtaining them from the organism, such as synthetically or recombinantly.

Preferably, the polynucleotide or protein is isolated from a fungus, more
preferably a filamentous fungus, even more preferably an Ascomycete.

Preferably, the polynucleotide or protein is isolated from an organism selected
from Aspergillus; Blumeria; Candida; Colletotrichum; Cryptococcus;
Encephalitozoon; Fusarium; Leptosphaeria; Magnaporthe; Mycosphaerella;
Neurospora; Phytophthora; Plasmopara; Pneumocystis; Pyricularia; Pythium;
Puccinia; Rhizoctonia; Schizosaccharomyces; Trichophyton; and Ustilago.

Preferably, the polynucleotide or protein is isolated from an organism
independently selected from a group of genera consisting of Aspergillus, Candida,
Colletotrichum, Fusarium, Magnaporthe, Mycosphaerella, Neurospora,
Schizosaccharomyces and Ustilago.

Preferably, the polynucleotide or protein is isolated from an organism selected
from the species Aspergillus flavus; Aspergillus fumigatus; Aspergillus nidulans;
Aspergillus niger; Aspergillus parasiticus; Aspergillus terreus; Blumeria graminis;
Candida albicans; Candida krusei; Candida glabrata; Candida parapsilosis; Candida
tropicalis; Colletotrichum trifolii; Cryptococcus neoformans; Encephalitozoon
cuniculi; Fusarium graminearum; Fusarium solani; Fusarium sporotrichoides;
Leptosphaeria nodorum; Magnaporthe grisea; Mycosphaerella graminicola;
Neurospora crassa; Phytophthora capsici; Phytophthora infestans; Plasmopara
viticola; Pneumocystis jiroveci; Puccinia coronata; Puccinia graminis; Pyricularia
oryzae; Pythium ultimum; Rhizoctonia solani; Schizosaccharomyces pombe;
Trichophyton interdigitale; Trichophyton rubrum; and Ustilago maydis.

Preferably, the polynucleotide or protein is isolated from an organism selected
from Aspergillus fumigatus, Aspergillus nidulans, Candida albicans, Colletotrichum
trifolii, Fusarium graminearum, Fusarium sporotrichoides, Magnaporthe grisea,
Mycosphaerella graminicola, Neurospora crassa, Schizosaccharomyces pombe and
Ustilago maydis.

The polynucleotide, and preferably the protein, may be isolated from A.
fumigatus AF293.



Table I. 2031 OR sequences claimed and their relationship to sequences given in the
sequence listing.

Organism and sequence | gDNA/EST (1) | Coding sequence (cDNA/mRNA) w/o UTRs (2) | Protein
A. fumigatus Oxidoreductase 2031 | SEQ ID No. 1: 299-469, 520-1618 | SEQ ID No. 2: 115-1384 | SEQ ID No. 3
A. fumigatus Oxidoreductase 4929 | SEQ ID No. 4: 1-180, 267-1352 | SEQ ID No. 5: 1-1266 | SEQ ID No. 6
A. fumigatus Oxidoreductase 1495 | SEQ ID No. 7: 1-1329 | SEQ ID No. 7: 1-1329 | SEQ ID No. 8
A. nidulans 1_112 | SEQ ID No. 9: 1-1269 | SEQ ID No. 9: 1-1269 | SEQ ID No. 10
C. albicans 2431 | SEQ ID No. 11: 1-1299 | SEQ ID No. 11: 1-1299 | SEQ ID No. 12
C. albicans 2464 | SEQ ID No. 13: 1-1110 | SEQ ID No. 13: 1-1110 | SEQ ID No. 14
N. crassa NCU07452.1 | SEQ ID No. 15: 1-1305 | SEQ ID No. 15: 1-1305 | SEQ ID No. 16
N. crassa Oxidoreductase NCU08900 | SEQ ID No. 17: 1-924, 1015-1362, 1435-1476 | SEQ ID No. 18: 1-1314 | SEQ ID No. 19
M. grisea MG04569.3 (pred gene) | SEQ ID No. 20: 1-726, 810-1412 | SEQ ID No. 21: 1-1329 | SEQ ID No. 22
S. pombe T39956 | SEQ ID No. 23: 1-1188 | SEQ ID No. 23: 1-1188 | SEQ ID No. 24
C. trifolii (EST assembly) | SEQ ID No. 25: 130-777 | SEQ ID No. 26: 1-645 | SEQ ID No. 27
F. sporotrichoides FsCon[0063] (ESTs) | SEQ ID No. 28: 103-803 | SEQ ID No. 29: 1-701 | SEQ ID No. 30
F. sporotrichoides FsCon[0237] (ESTs) | SEQ ID No. 31: 76-631 (rev comp) | SEQ ID No. 32: 1-556 | SEQ ID No. 33
F. sporotrichoides FsCon[0458] (ESTs) | SEQ ID No. 34: 174-657 | SEQ ID No. 34: 174-657 | SEQ ID No. 35
F. graminearum 1577I74I (EST) | SEQ ID No. 36: 1-744 | SEQ ID No. 37: 1-742 (4) | SEQ ID No. 38
F. graminearum FG00074.1 | SEQ ID No. 82: 1-1326 | SEQ ID No. 82: 1-1326 | SEQ ID No. 83
M. graminicola mg[0281] (EST) | SEQ ID No. 39: 1-647 | SEQ ID No. 39: 1-647 | SEQ ID No. 40
M. graminicola mga0328f (EST) | SEQ ID No. 41: 1-560 | SEQ ID No. 41: 1-560 | SEQ ID No. 42
M. grisea MG03823.3 | SEQ ID No. 43: 1-1254 | SEQ ID No. 43: 1-1254 | SEQ ID No. 44
Ustilago maydis Contig 1,2 | SEQ ID No. 84: 1-1350 | SEQ ID No. 84: 1-1350 | SEQ ID No. 85

(1) Numbers after SEQ ID Nos. correspond to bases of genomic DNA encoding the
protein.

(2) RNA sequences are given in the sequence listing with Thymidine (T), although it is
understood that in vivo Uridine (U) would be present.

(3) A one-base deletion at position 690 of the EST (SEQ ID No. 22) is required to give
the best predicted cDNA/protein.

(4) Two single base deletions are required to optimise translation.

Bioinformatics analysis was carried out to identify functionally important
regions within the fungal 2031 ORs. The 2031 ORs are related to but distinct from the
"Old Yellow Enzyme" (OYE) group of yeast enzymes, which also includes ergosterol-binding
protein of Candida albicans. Comparison of the 2031 ORs with crystal
structures of OYE family proteins identified highly conserved residues responsible for
the catalytic function of these enzymes. However, the comparisons also identified
seven clusters of residues conserved in 2031 enzymes but not OYE enzymes which
flanked the substrate binding site and were therefore implicated in determining
substrate specificity (regions 2, 4, 6, 7, 8, 10 and 11 in Figures 1 and 2, and Example 4
hereinafter). Four further conserved clusters of residues were identified which, while
not predicted to be involved in catalysis, were conserved in 2031 but not OYE and so
also distinguish 2031 ORs from OYEs (regions 1, 3, 5 and 9 in Figures 1 and 2, and
Example 4 hereinafter).

Variants of the above mentioned polynucleotides and proteins are also provided, and
are discussed below.

In one embodiment, the protein of the invention may comprise an amino acid
sequence substantially as set out and independently selected from regions 1-11 of any
of SEQ ID Nos. 3, 6, 8, 10, 12, 14, 16, 19, 22, 24, 27, 30, 33, 35, 38, 40, 42, 44, 83 or
85 as given in Figure 1, or variants thereof. At least one region or motif may be
functional.

The polynucleotide of the invention may comprise DNA, such as genomic DNA.
The polynucleotide may comprise a sequence substantially as set out and
independently selected from regions 1-11 of any of SEQ ID Nos. 1, 4, 7, 9, 11, 13,
15, 17, 20, 23, 25, 28, 31, 34, 36, 39, 41, 43, 82 or 84 as given in Figure 2, or
complements, or variants thereof.

Preferably, the polynucleotide encodes a fungal 2031 OR protein which 
comprises substantially the amino acid sequences SEQ ID Nos 3, 6, 8, 10, 12, 14, 16, 
19, 22, 24, 27, 30, 33, 35, 38, 40, 42, 83 or 85 or a variant thereof. 

The polynucleotide may comprise RNA, preferably mRNA, preferably spliced
mRNA. Preferably, the polynucleotide comprises substantially the sequence shown as
SEQ ID Nos. 2, 5, 7, 9, 11, 13, 15, 18, 21, 23, 26, 29, 32, 34, 36, 37, 39, 41, 43, 82 or
84 or a complement, or a variant thereof.

Preferably, the protein comprises substantially the sequences SEQ ID Nos. 3, 6,
8, 10, 12, 14, 16, 19, 22, 24, 27, 30, 33, 35, 38, 40, 42, 44, 83 or 85 or a variant
thereof.

Preferably, the protein is encoded by the regions of sequences SEQ ID Nos. 1, 4,
7, 9, 11, 13, 15, 17, 20, 23, 25, 26, 28, 29, 31, 34, 36, 39, 41, 43, 82 or 84 as described
in Figure 1, in the column "gDNA/EST" in Table I, or a complement, or a variant
thereof.

The polynucleotide may comprise substantially a nucleotide sequence region or
motif independently selected from at least one of regions 1-11 from at least one of the
sequences SEQ ID Nos. 1, 2, 4, 5, 7, 9, 11, 13, 15, 17, 18, 20, 21, 23, 25, 26, 28, 29,
31, 32, 34, 36, 37, 39, 41, 43, 82 or 84, as given in Figure 2, or a complement, or a
variant thereof.



Preferably, the isolated polynucleotide comprises substantially a nucleotide
sequence independently selected from the regions and sequences given in the column
"gDNA/EST" in Table I.

Preferably, the protein is encoded by a polynucleotide which polynucleotide
comprises substantially a sequence independently selected from at least one of the
regions and sequences given in the column "gDNA/EST" in Table I, or a complement,
or a variant thereof.

By the term "native amino acid/polynucleotide/protein", is meant an amino acid,
polynucleotide or protein produced naturally from biological sources either in vivo or
in vitro.

By the term "synthetic amino acid/polynucleotide/protein", is meant an amino
acid, polynucleotide or protein which has been produced artificially or de novo using a
DNA or protein synthesis machine known in the art.

By the term "recombinant amino acid/polynucleotide/protein", is meant an
amino acid, polynucleotide or protein which has been produced using recombinant
DNA or protein technology or methodologies which are known to the skilled
technician.

The term "variant", and the terms "substantially the amino
acid/polynucleotide/protein sequence" are used herein to refer to related sequences.
As discussed below such related sequences are typically homologous to (share
percentage identity with) a given sequence, for example over the entire length of the
sequence or over a portion of a given length. The related sequence may also be a
fragment of the sequence or of a homologous sequence. A variant protein may be
encoded by a variant polynucleotide.

By the term "variant", and the terms "substantially the amino
acid/polynucleotide/protein sequence", we mean that the sequence has at least 30%,
preferably 40%, more preferably 50%, and even more preferably 60% sequence identity
with the amino acid/polynucleotide/protein sequences of any one of the sequences referred
to. A sequence which is "substantially the amino acid/polynucleotide/peptide sequence"
may be the same as the relevant sequence.

Calculation of percentage identities between different amino
acid/polynucleotide/protein sequences may be carried out as follows. A multiple
alignment is first generated by the ClustalX program (pairwise parameters: gap
opening 10.0, gap extension 0.1, protein matrix Gonnet 250, DNA matrix IUB;
multiple parameters: gap opening 10.0, gap extension 0.2, delay divergent sequences
30%, DNA transition weight 0.5, negative matrix off, protein matrix Gonnet series,
DNA weight IUB; protein gap parameters: residue-specific penalties on, hydrophilic
penalties on, hydrophilic residues GPSNDQERK, gap separation distance 4, end gap
separation off). The percentage identity is then calculated from the multiple alignment
as (N/T)*100, where N is the number of positions at which the two sequences share an
identical residue, and T is the total number of positions compared. Alternatively,
percentage identity can be calculated as (N/S)*100 where S is the length of the shorter
sequence being compared. The amino acid/polynucleotide/protein sequences may be
synthesised de novo, or may be native amino acid/polynucleotide/protein sequence, or
a derivative thereof.
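
The two formulas above can be sketched in a few lines of Python. This is an illustrative sketch only, not the ClustalX implementation: the function name `percent_identity`, the gap character "-" and the choice to skip columns in which both rows carry a gap when counting T are assumptions made for the example, and the two short sequences are invented.

```python
def percent_identity(aligned_a: str, aligned_b: str,
                     gap: str = "-", shorter: bool = False) -> float:
    """Percentage identity between two rows taken from one alignment.

    shorter=False gives (N/T)*100, shorter=True gives (N/S)*100.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned rows must have the same length")

    # N: columns where both rows carry the same residue (a gap never counts).
    n = sum(1 for a, b in zip(aligned_a, aligned_b) if a == b and a != gap)

    if shorter:
        # S: ungapped length of the shorter of the two sequences.
        denom = min(len(aligned_a.replace(gap, "")),
                    len(aligned_b.replace(gap, "")))
    else:
        # T: positions compared. Assumption: columns that are gaps in
        # both rows (possible in a multiple alignment) are not compared.
        denom = sum(1 for a, b in zip(aligned_a, aligned_b)
                    if not (a == gap and b == gap))

    if denom == 0:
        raise ValueError("nothing to compare")
    return 100.0 * n / denom


# Hypothetical rows: 5 identical columns out of 7 compared.
row_a = "MKT-AYL"
row_b = "MKTGA-L"
identity_nt = percent_identity(row_a, row_b)                 # (5/7)*100
identity_ns = percent_identity(row_a, row_b, shorter=True)   # (5/6)*100
```

The (N/S) form gives the higher value whenever the alignment contains gaps, so the formula used should be stated alongside any identity threshold.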

An amino acid/polynucleotide/protein sequence with a greater identity than 65% 
to any of the sequences referred to is also envisaged. An amino 
acid/polynucleotide/protein sequence with a greater identity than 70% to any of the 
sequences referred to is also envisaged. An amino acid/polynucleotide/protein 
sequence with a greater identity than 75% to any of the sequences referred to is also 
envisaged. An amino acid/polynucleotide/protein sequence with a greater identity than 
80% to any of the sequences referred to is also envisaged. Preferably, the amino 
acid/polynucleotide/protein sequence has 85% identity with any of the sequences
referred to, more preferably 90% identity, even more preferably 92% identity, even 
more preferably 95% identity, even more preferably 97% identity, even more 
preferably 98% identity and, most preferably, 99% identity with any of the referred to 
sequences. 

The above mentioned percentage identities may be measured over the entire
length of the original sequence or over a region of 15, 20, 50 or 100 amino acids/bases
of the original sequence. In a preferred embodiment percentage identity is measured
with reference to SEQ ID No. 3. Preferably the variant protein has at least 40%
identity, such as at least 60% or at least 80% identity with SEQ ID No. 3 or a portion
of SEQ ID No. 3.

Alternatively, a substantially similar nucleotide sequence will be encoded by a
sequence which hybridizes to the sequences shown in SEQ ID Nos. 1, 2, 4, 5, 7, 8, 9, 11,
13, 15, 17, 18, 20, 21, 23, 25, 26, 28, 29, 31, 32, 34, 36, 37, 39, 41, 43, 82 or 84 or their
complements under stringent conditions. By stringent conditions, we mean the nucleotide
hybridises to filter-bound DNA or RNA in 6x sodium chloride/sodium citrate (SSC) at
approximately 45°C followed by at least one wash in 0.2x SSC/0.1% SDS at approximately
5-65°C. Alternatively, a substantially similar protein may differ by at least 1, but less than
5, 10, 20, 50 or 100 amino acids from the sequences shown in SEQ ID Nos. 3, 6, 8, 10, 12,
14, 16, 19, 22, 24, 27, 30, 33, 35, 38, 40, 42, 44, 83 or 85. Such differences may each be
additions, deletions or substitutions.

Due to the degeneracy of the genetic code, it is clear that any nucleic acid 
sequence could be varied or changed without substantially affecting the sequence of 
the protein encoded thereby, to provide a functional variant thereof. Suitable 
nucleotide variants are those having a sequence altered by the substitution of different 
codons that encode the same amino acid within the sequence, thus producing a silent
change.

Other suitable variants are those having homologous nucleotide sequences but
comprising all, or portions of, sequence which are altered by the substitution of
different codons that encode an amino acid with a side chain of similar biophysical
properties to the amino acid it substitutes, to produce a conservative change. For
example small non-polar, hydrophobic amino acids include glycine, alanine, leucine,
isoleucine, valine, proline, and methionine. Large non-polar, hydrophobic amino acids
include phenylalanine, tryptophan and tyrosine. The polar neutral amino acids include
serine, threonine, cysteine, asparagine and glutamine. The positively charged (basic)
amino acids include lysine, arginine and histidine. The negatively charged (acidic)
amino acids include aspartic acid and glutamic acid. Certain organisms, including
Candida, are known to use non-standard codons compared to those used in the majority
of eukaryotes. Any comparisons of polynucleotides and proteins from such organisms
with the sequences given here should take these differences into account.
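
The amino acid groupings listed above lend themselves to a simple lookup. The following Python sketch is an illustration only: the names `AMINO_ACID_GROUPS` and `classify_substitution`, and the three labels it returns, are assumptions introduced for the example, and it treats a change as conservative only when both residues fall in the same group as listed in the text.

```python
# One-letter codes for the five groups named in the text.
AMINO_ACID_GROUPS = {
    "small non-polar": "GALIVPM",   # Gly, Ala, Leu, Ile, Val, Pro, Met
    "large non-polar": "FWY",       # Phe, Trp, Tyr
    "polar neutral": "STCNQ",       # Ser, Thr, Cys, Asn, Gln
    "basic": "KRH",                 # Lys, Arg, His
    "acidic": "DE",                 # Asp, Glu
}

# Reverse lookup: residue -> group name.
GROUP_OF = {
    residue: group
    for group, residues in AMINO_ACID_GROUPS.items()
    for residue in residues
}


def classify_substitution(original: str, replacement: str) -> str:
    """Label a single-residue change using the groups above."""
    original, replacement = original.upper(), replacement.upper()
    for residue in (original, replacement):
        if residue not in GROUP_OF:
            raise ValueError(f"unknown amino acid code: {residue!r}")
    if original == replacement:
        return "identical"
    if GROUP_OF[original] == GROUP_OF[replacement]:
        return "conservative"
    return "non-conservative"


# Leucine to isoleucine stays within the small non-polar group;
# lysine to aspartate swaps a basic residue for an acidic one.
leu_to_ile = classify_substitution("L", "I")
lys_to_asp = classify_substitution("K", "D")
```

A table of this kind is coarser than a scoring matrix, which grades each pair of residues individually rather than assigning them to fixed groups.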

In accurate alignment of protein or DNA sequences the trade-off between
optimal matching of sequences and the introduction of gaps to obtain such a match is
important. In the case of proteins, the means by which matches are scored is also of
significance. The family of PAM matrices (e.g., Dayhoff, M. et al., 1978, Atlas of
protein sequence and structure, Natl. Biomed. Res. Found.) and BLOSUM matrices
quantitate the nature and likelihood of conservative substitutions and are used in
multiple alignment algorithms, although other, equally applicable matrices will be
known to those skilled in the art. The popular multiple alignment program ClustalW,
and its windows version ClustalX (Thompson et al., 1994, Nucleic Acids Research, 22,
4673-4680; Thompson et al., 1997, Nucleic Acids Research, 25, 4876-4882) are
efficient ways to generate multiple alignments of proteins and DNA.

Use of the Align program is also preferred (Hepperle, D., 2001: Multicolor
Sequence Alignment Editor. Institute of Freshwater Ecology and Inland Fisheries,
16775 Stechlin, Germany), although others, such as JalView or Cinema are also
suitable.

Calculation of percentage identities between proteins occurs during the
generation of multiple alignments by Clustal. However, these values need to be
recalculated if the alignment has been manually improved, or for the deliberate
comparison of two sequences. Programs that calculate this value for pairs of protein
sequences within an alignment include PROTDIST within the PHYLIP phylogeny
package (Felsenstein; http://evolution.gs.washington.edu/phylip.html) using the
"Similarity Table" option as the model for amino acid substitution (P). For DNA/RNA,
an identical option exists within the DNADIST program of PHYLIP.

Other modifications in protein sequences are also envisaged and within the scope
of the claimed invention, i.e. those which occur during or after translation, e.g. by
acetylation, amidation, carboxylation, phosphorylation, proteolytic cleavage or linkage
to a ligand.

The term "variant", and the terms "substantially the amino
acid/polynucleotide/protein sequence" also include a fragment of the relevant
polynucleotide or protein sequences, including a fragment of the homologous sequences
(which have percentage identity to a specified sequence) referred to above. A
polynucleotide fragment will typically comprise at least 10 bases, such as at least 20, 30,
50, 100, 200, 500 or 1000 bases. A protein fragment will typically comprise at least 10
amino acids, such as at least 20, 30, 50, 80, 100, 150, 200, 300, 400 or 500 amino acids.
The fragments may lack at least 3 amino acids, such as at least 10, 20 or 30 amino acids,
from either end of the protein.

The invention provides a method of screening which may be used to identify
modulators of 2031 OR proteins or polynucleotides, such as inhibitors of expression or
activity of the proteins or polynucleotides of the invention. In one embodiment of the
method a candidate substance is contacted with a protein or polynucleotide of the
invention and whether or not the candidate substance binds or modulates the protein or
polynucleotide is determined.

The modulator may promote (agonise) or inhibit (antagonise) the activity of the 
protein. A therapeutic modulator (against fungal infection) will inhibit the expression 
or activity of a protein or polynucleotide of the invention.

The method may be carried out in vitro (inside or outside a cell) or in vivo. In 
one embodiment the method is carried out on a cell, cell culture or cell extract. The
cell may or may not be a cell in which the polynucleotide or protein is naturally 
present. The cell may or may not be a fungal cell, or may or may not be a cell of any 
of the fungi mentioned herein. The protein or polynucleotide may be present in a non- 
cellular form in the method, thus the protein may be in the form of a recombinant 
protein purified from a cell. 

Any suitable binding or activity assay may be used. Methods which determine 
whether a candidate substance is able to bind the protein or polynucleotide may 
comprise providing the protein or polynucleotide to a candidate substance and 
determining whether binding occurs, for example by measuring the amount of the 
candidate substance which binds the protein or polynucleotide. The binding may be 
determined by measuring a characteristic of the protein or polynucleotide that changes 
upon binding, such as spectroscopic changes. The binding may be determined by 
measuring reaction substrate or product levels in the presence and absence of the 
candidate and comparing the levels. 

The assay format may be a 'band shift' system. This involves determining 
whether a test candidate advances or retards the protein or polynucleotide on gel
electrophoresis relative to the absence of the compound. 

The method may be a competitive binding method. This determines whether the
candidate is able to inhibit the binding of the protein or polynucleotide to an agent
which is known to bind to the protein or polynucleotide, such as an antibody specific
for the protein, or a substrate of the protein.

Whether or not a candidate substance modulates the activity of the protein may
be determined by providing the candidate substance to the protein under conditions
that permit activity of the protein, and determining whether the candidate substance is
able to modulate the activity of the protein.

The activity which is measured may be any of the activities of the protein of the
invention mentioned herein, such as oxidoreductase activity. In one embodiment the
screening method comprises carrying out a redox reaction in the presence and
absence of the candidate substance to determine whether the candidate substance
inhibits the oxidoreductase activity of the protein of the invention, wherein the redox
reaction is carried out by contacting said protein with NADH or NADPH, and an
electron acceptor, under conditions in which in the absence of the candidate substance
the protein catalyses reduction of the electron acceptor.

In a preferred embodiment the inhibition of the redox reaction is measured by 
detecting the amount of NADH or NADPH oxidation, for example by measuring the 
generation of the oxidised forms of NADH and NADPH spectroscopically. This can 
be done by measurement at 340 nm (see Example 7). 
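
By way of illustration only, the comparison of NAD(P)H oxidation in the presence and absence of a candidate substance might be computed as in the following minimal sketch. The function names and the absorbance readings are hypothetical and are not taken from Example 7; the molar extinction coefficient of 6220 M^-1 cm^-1 is the commonly quoted value for NADH and NADPH at 340 nm and is assumed here, as is a 1 cm path length.

```python
# Illustrative sketch only: names and example readings are hypothetical.
# Assumes the commonly quoted extinction coefficient for NAD(P)H at 340 nm.
NADH_EXTINCTION_340 = 6220.0  # M^-1 cm^-1


def oxidation_rate(times_min, a340, path_cm=1.0):
    """Return NAD(P)H oxidised in micromolar per minute.

    The rate is the least-squares slope of A340 against time,
    converted to concentration with the Beer-Lambert law.
    """
    n = len(times_min)
    mean_t = sum(times_min) / n
    mean_a = sum(a340) / n
    sxx = sum((t - mean_t) ** 2 for t in times_min)
    sxy = sum((t - mean_t) * (a - mean_a) for t, a in zip(times_min, a340))
    slope = sxy / sxx  # absorbance units per minute; negative as NAD(P)H is consumed
    return -slope / (NADH_EXTINCTION_340 * path_cm) * 1e6


def percent_inhibition(rate_without, rate_with):
    """Percentage reduction in rate caused by the candidate substance."""
    return 100.0 * (1.0 - rate_with / rate_without)


if __name__ == "__main__":
    times = [0.0, 1.0, 2.0, 3.0]                 # minutes
    no_candidate = [1.000, 0.938, 0.876, 0.814]  # hypothetical A340 readings
    with_candidate = [1.000, 0.969, 0.938, 0.907]
    r0 = oxidation_rate(times, no_candidate)
    r1 = oxidation_rate(times, with_candidate)
    print(f"inhibition: {percent_inhibition(r0, r1):.1f}%")
```

A fitted slope is used rather than a two-point difference so that noise in individual readings has less influence on the estimated rate.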

Alternatively, a suitable colourimetric oxidoreductase substrate may be used to 
measure inhibition, such as methylene blue, phenazine methosulphate or 
2,6-dichlorophenolindophenol. 

Suitable candidate substances which can be tested in the above methods include 
antibody products (for example, monoclonal and polyclonal antibodies, single chain 
antibodies, chimeric antibodies and CDR-grafted antibodies). Furthermore, 
combinatorial libraries, defined chemical identities, peptide and peptide mimetics, 
oligonucleotides and natural product libraries, such as display libraries (e.g. phage 
display libraries) may also be tested. The candidate substances may be chemical 
compounds. Batches of the candidate substances may be used in an initial screen of, 
for example, ten substances per reaction, and the substances from batches which show 
inhibition tested individually. 
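
The batch-then-individual strategy described above can be sketched as follows. This is a hypothetical illustration only: the `assay` callable stands in for whichever inhibition assay is used, and the compound names and simulated inhibitors are invented for the example.

```python
# Illustrative sketch only: `assay` is a hypothetical stand-in for a real
# inhibition assay; it takes a list of substances and returns True when
# the reaction containing them shows inhibition.
def pooled_screen(candidates, assay, batch_size=10):
    """Screen candidates in batches, then retest members of positive batches.

    Returns (hits, assays_run).
    """
    hits = []
    assays_run = 0
    for start in range(0, len(candidates), batch_size):
        batch = candidates[start:start + batch_size]
        assays_run += 1
        if not assay(batch):
            continue  # no inhibition in this batch; skip individual tests
        for substance in batch:
            assays_run += 1
            if assay([substance]):
                hits.append(substance)
    return hits, assays_run


if __name__ == "__main__":
    compounds = [f"cpd{i}" for i in range(25)]
    true_inhibitors = {"cpd3", "cpd17"}  # invented for the simulation

    def simulated_assay(substances):
        return any(s in true_inhibitors for s in substances)

    found, n_assays = pooled_screen(compounds, simulated_assay)
    print(found, n_assays)
```

Batching reduces the number of reactions when inhibitors are rare; when many batches are positive, the saving over testing each substance individually shrinks accordingly.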

According to a further aspect of the present invention, there is provided a 
polynucleotide or protein of the invention for use as a medicament or in diagnosis. 

The polynucleotide or protein may be modified prior to use, preferably to 
produce a derivative or variant thereof. The polynucleotide or protein may be 
derivatised. The protein may be modified by epitope tagging, addition of fusion 
partners or purification tags such as glutathione S-transferase, multiple histidines or 
maltose binding protein, addition of green fluorescent protein, covalent attachment of 
molecules including biotin or fluorescent tags, incorporation of selenomethionine, 
inclusion or attachment of radioisotopes or fluorescent/non-fluorescent lanthanide 
chelates. The polynucleotide may be modified by methylation or attachment of 
digoxigenin (DIG) or by addition of sequence encoding the above tags, proteins or 
epitopes. 

Preferably, the medicament is adapted to retard or prevent a fungal infection. 
The fungal infection may be in human, animal or plant. The polynucleotide or protein 
may be used for the development of a drug. The polynucleotide or protein may be used 
in, or for the generation of, a molecular model of said polynucleotide or said protein. 

According to a further aspect of the present invention, there is provided use of a 
polynucleotide or protein of the invention for the preparation of a medicament for the 
treatment of a fungal infection. 

The polynucleotide or protein may be modified prior to use, preferably to 
produce a derivative or variant thereof. The polynucleotide or protein may be 
derivatised. The polynucleotide or protein may not be modified or derivatised. 

Preferably, the medicament is adapted to retard or prevent a fungal infection. 
The treatment may comprise retarding or preventing fungal infection. Preferably, the 
drug and/or medicament comprises an inhibitor, preferably a 2031 OR inhibitor. 
Preferably, the drug or medicament is adapted to inhibit expression and/or activity of 
the polynucleotide or a fragment thereof, and/or the function of the protein or a 
fragment thereof. 

Preferably, the fungal infection comprises an infection by a fungus, more 
preferably an Ascomycete, and even more preferably, an organism selected from the 
genera Aspergillus; Blumeria; Candida; Colletotrichium; Cryptococcus; 
Encephalitozoon; Fusarium; Leptosphaeria; Magnaporthe; Mycosphaerella; 
Neurospora; Phytophthora; Plasmopara; Pneumocystis; Pyricularia; Pythium; 
Puccinia; Rhizoctonia; Schizosaccharomyces; Trichophyton; and Ustilago. 

Preferably, the fungal infection comprises an infection by an organism selected 
from the genera Aspergillus, Candida, Colletotrichium, Fusarium, Magnaporthe, 
Mycosphaerella and Ustilago. 

Preferably, the fungal infection comprises an infection by an organism selected 
from the species Aspergillus flavus; Aspergillus fumigatus; Aspergillus nidulans; 
Aspergillus niger; Aspergillus parasiticus; Aspergillus terreus; Blumeria graminis; 
Candida albicans; Candida cruzei; Candida glabrata; Candida parapsilosis; Candida 
tropicalis; Colletotrichium trifolii; Cryptococcus neoformans; Encephalitozoon 
cuniculi; Fusarium graminearum; Fusarium solani; Fusarium sporotrichoides; 
Leptosphaeria nodorum; Magnaporthe grisea; Mycosphaerella graminicola; 
Phytophthora capsici; Phytophthora infestans; Plasmopara viticola; Pneumocystis 
jiroveci; Puccinia coronata; Puccinia graminis; Pyricularia oryzae; Pythium ultimum; 
Rhizoctonia solani; Trichophyton interdigitale; Trichophyton rubrum; and Ustilago 
maydis. 

Preferably, the fungal infection comprises an infection by an organism selected 
from the species Aspergillus fumigatus, Aspergillus nidulans, Candida albicans, 
Colletotrichium trifolii, Fusarium graminearum, Fusarium sporotrichoides, 
Magnaporthe grisea, Mycosphaerella graminicola and Ustilago maydis. 

According to another aspect of the present invention, there is provided a method 
of detecting the presence of a fungal infection in an individual, said method 
comprising:- 

(i) obtaining a sample from an organism; and 

(ii) detecting in the said sample the presence of a polynucleotide or protein of 
the invention. 

The individual may be a person (human) or animal (such as a mammal or bird) 
or a plant. The fungal infection may arise from infection with an organism selected 
from the genera Aspergillus; Blumeria; Candida; Colletotrichium; Cryptococcus; 
Encephalitozoon; Fusarium; Leptosphaeria; Magnaporthe; Mycosphaerella; 
Phytophthora; Plasmopara; Pneumocystis; Pyricularia; Pythium; Puccinia; 
Rhizoctonia; Trichophyton; and Ustilago. 

The fungal infection may arise from infection with an organism selected from 
the species Aspergillus flavus; Aspergillus fumigatus; Aspergillus nidulans; 
Aspergillus niger; Aspergillus parasiticus; Aspergillus terreus; Blumeria graminis; 
Candida albicans; Candida cruzei; Candida glabrata; Candida parapsilosis; Candida 
tropicalis; Colletotrichium trifolii; Cryptococcus neoformans; Encephalitozoon 
cuniculi; Fusarium graminearum; Fusarium solani; Fusarium sporotrichoides; 
Leptosphaeria nodorum; Magnaporthe grisea; Mycosphaerella graminicola; 
Phytophthora capsici; Phytophthora infestans; Plasmopara viticola; Pneumocystis 
jiroveci; Puccinia coronata; Puccinia graminis; Pyricularia oryzae; Pythium ultimum; 
Rhizoctonia solani; Trichophyton interdigitale; Trichophyton rubrum; and Ustilago 
maydis. 

Preferably, the sample comprises a biological sample which, preferably, 
comprises nucleic acid and/or protein. In one embodiment of the method the nucleic 
acid or protein is purified (at least partially) from the sample before the detection is 
performed. 

Where the organism is Aspergillus fumigatus, Aspergillus nidulans or 
Aspergillus niger, the sample may comprise sputum, bronchoalveolar lavage, urine, 
respiratory specimens, endotracheal aspirates, sterile specimens obtained by an 
invasive procedure such as vitreous tap, tympanocentesis, brain biopsy or aspiration, 
nasal or sinus specimens, blood, tissue or autopsy. 

Where the organism is Magnaporthe grisea the sample may comprise rice leaf or 
rice stem. 

Preferably, said detecting of the presence in the said sample of a polynucleotide 
as defined by the first or third aspect comprises use of at least one oligonucleotide pair 
adapted to be used for amplification of DNA, preferably genomic, more preferably, 
fungal genomic DNA. The amplification may be PCR amplification. 

Preferably, the PCR amplification employs at least one primer pair comprising a 
polynucleotide selected from the group consisting of: 

Aspergillus fumigatus: SEQ ID Nos 67 and 68 for SEQ ID No. 1; SEQ ID Nos 
69 and 70 for SEQ ID No. 4; and SEQ ID Nos 71 and 72 for SEQ ID No. 7. 
Candida albicans: SEQ ID Nos 73 and 74 for SEQ ID No. 11. 
Magnaporthe grisea: SEQ ID Nos 75 and 76 for SEQ ID No. 20. 

Preferably, said detecting comprises subjecting the amplified DNA to size 
analysis, preferably, electrophoresis and, preferably, comparing the results to a positive 
control and, preferably, a negative control. Said detecting may also comprise 
sequencing of the amplified DNA to demonstrate the correct sequence. 



Preferably, said detecting of the presence in the said sample of a protein 
comprises use of a monoclonal or polyclonal antibody directed to part or all of the 
protein of the invention. 

According to a further aspect of the present invention, there is provided a 
recombinant DNA molecule or vector comprising a polynucleotide of the invention. 

The recombinant DNA molecule or vector may comprise an expression cassette. 
Preferably, the recombinant DNA molecule or vector comprises an expression vector. 
Preferably, the polynucleotide sequence is operatively linked to an expression control 
sequence. A suitable control sequence may comprise a promoter, an enhancer etc. 

According to another aspect of the present invention, there is provided a cell 
containing a polynucleotide, recombinant DNA molecule or vector of the invention. 

The cell may be transformed or transfected with the polynucleotide, recombinant 
DNA molecule or vector by suitable means. Preferably, the cell produces a 
recombinant protein of the invention. 

The invention also provides an organism which is transgenic for the 
polynucleotide of the invention (whose cells may be the same as the cells of the 
invention mentioned herein). Such an organism is typically a fungus, such as any 
genera or species of fungus mentioned herein. The organism may be a microorganism, 
such as a bacterium, virus or yeast. The organism may be a plant or animal (including 
birds and mammals), such as any of the animals mentioned herein. 

The organism may be produced by introduction of the polynucleotide of the 
invention into a cell of the organism, and in the case of a multicellular organism allowing 
the cell to grow into a whole organism. 

According to a further aspect of the present invention, there is provided a cell in 
which a native polynucleotide or protein of the invention is non-functional 
and/or inhibited. The cell may be of, or present in, a multicellular organism. 

The cell may be a mutant cell. The cell is typically a fungal cell, such as of any 
genera or species of fungus mentioned herein. A preferred means of generating the cell is 
to modify the polynucleotide of the invention, such that the polynucleotide is 
non-functional. This modification may be to cause a mutation, which disrupts the expression or 
function of a gene product. Such mutations may be to the nucleic acid sequences that act as 
5' or 3' regulatory sequences for the polynucleotide, or may be a mutation introduced into 
the coding sequence of the polynucleotide. Functional deletion of the polynucleotide may 
be, for example, by mutation of the polynucleotide in the form of nucleotide substitution, 
addition or, preferably, nucleotide deletion. 

The polynucleotide may be made non-functional and/or inhibited by: 

(i) shifting the reading frame of the coding sequence of the polynucleotide; 

(ii) adding, substituting or deleting amino acids in the protein encoded by the 
polynucleotide; 

(iii) partially or entirely deleting the DNA coding for the polynucleotide and/or 
the upstream and downstream regulatory sequences associated with the polynucleotide; or 

(iv) inserting DNA into the coding or non-coding regions. 

A preferred means of introducing a mutation into a polynucleotide is to utilize 
molecular biology techniques specifically to target the polynucleotide which is to be 
mutated. Mutations may be induced using a DNA molecule. A most preferred means 
of introducing a mutation is to use a DNA molecule that has been especially prepared 
such that homologous recombination occurs between the target polynucleotide and the 
DNA molecule. When this is the case, the DNA molecule, which may be double 
stranded, may contain base sequences similar or identical to the target polynucleotide 
to allow the DNA molecule to hybridize to (and subsequently recombine with) the 
target. 

It is also possible to provide a cell in which the polynucleotide is non-functional 
and/or inhibited without introducing a mutation into the gene or its regulatory regions. 
This may be done by using specific inhibitors. Examples of such inhibitors include agents 
that prevent transcription of the polynucleotide, or prevent translation, expression or 
disrupt post-translational modification. Alternatively, the inhibitor may be an agent that 
increases degradation of the gene product (e.g. a specific proteolytic enzyme). Equally, the 
inhibitor may be an agent which prevents the polynucleotide product from functioning, 
such as neutralizing antibodies (for instance an anti-2031 OR antibody). The inhibitor may 
also be an antisense oligonucleotide, or any synthetic chemical capable of inhibiting 
expression of the gene or the stability and/or function of the protein. The inhibitor may also 
be a protein which interacts with the 2031 OR to prevent its function. The inhibitor may 
also be an RNA molecule which causes inhibition by RNA interference. In one 
embodiment the antisense polynucleotide or RNA molecule which causes RNA 
interference are examples of polynucleotides of the invention. 

According to a further aspect, there is provided an antibody exhibiting 
immunospecificity for a protein of the invention. The antibody may be used as a 
diagnostic reagent. 

The antibody may be monoclonal or polyclonal, and may be raised in mouse, rat, 
rabbit, chicken, turkey, horse, goat or donkey. The antibody may be raised against one 
or all of the proteins together, or may be raised against proteolytic or recombinant 
fragments. 

For the purposes of this invention, the term "antibody", unless specified to the 
contrary, includes fragments which bind a protein of the invention. Such fragments 
include Fv, F(ab') and F(ab')2 fragments, as well as single chain antibodies. 
Furthermore, the antibodies and fragments thereof may be chimeric antibodies, 
CDR-grafted antibodies or humanised antibodies. 

Administration 

The formulation of any of the therapeutic substances (e.g. proteins, 
polynucleotides or modulators) mentioned herein will depend upon factors such as the 
nature of the substance and the condition to be treated. Any such substance may be 
administered in a variety of dosage forms. It may be administered orally (e.g. as 
tablets, troches, lozenges, aqueous or oily suspensions, dispersible powders or 
granules), parenterally, subcutaneously, intravenously, intramuscularly, intrasternally, 
transdermally or by infusion techniques. The substance may also be administered as 
suppositories. A physician will be able to determine the required route of 
administration for each particular patient. 

Typically the substance is formulated for use with a pharmaceutically acceptable 
carrier or diluent. The pharmaceutical carrier or diluent may be, for example, an 
isotonic solution. For example, solid oral forms may contain, together with the active 
compound, diluents, e.g. lactose, dextrose, saccharose, cellulose, corn starch or potato 
starch; lubricants, e.g. silica, talc, stearic acid, magnesium or calcium stearate, and/or 
polyethylene glycols; binding agents, e.g. starches, arabic gums, gelatin, 
methylcellulose, carboxymethylcellulose or polyvinyl pyrrolidone; disaggregating 
agents, e.g. starch, alginic acid, alginates or sodium starch glycolate; effervescing 
mixtures; dyestuffs; sweeteners; wetting agents, such as lecithin, polysorbates, 
laurylsulphates; and, in general, non-toxic and pharmacologically inactive substances 
used in pharmaceutical formulations. Such pharmaceutical preparations may be 
manufactured in known manner, for example, by means of mixing, granulating, 
tabletting, sugar-coating, or film coating processes. 

Liquid dispersions for oral administration may be syrups, emulsions and 
suspensions. The syrups may contain as carriers, for example, saccharose or 
saccharose with glycerine and/or mannitol and/or sorbitol. Suspensions and emulsions 
may contain as carrier, for example a natural gum, agar, sodium alginate, pectin, 
methylcellulose, carboxymethylcellulose, or polyvinyl alcohol. The suspensions or 
solutions for intramuscular injections may contain, together with the active compound, 
a pharmaceutically acceptable carrier, e.g. sterile water, olive oil, ethyl oleate, glycols, 
e.g. propylene glycol, and if desired, a suitable amount of lidocaine hydrochloride. 

Solutions for intravenous injection or infusion may contain as carrier, for example, sterile 
water or preferably they may be in the form of sterile, aqueous, isotonic saline 
solutions. 

A therapeutically effective non-toxic amount of substance is administered. The 
dose may be determined according to various parameters, especially according to the 
substance used; the age, weight and condition of the patient to be treated; the route of 
administration; and the required regimen. Again, a physician will be able to determine 
the required route of administration and dosage for any particular patient. A typical 
daily dose is from about 0.1 to 50 mg per kg, preferably from about 0.1 mg/kg to 
10 mg/kg of body weight, according to the activity of the specific inhibitor, the age, 
weight and conditions of the subject to be treated, the type and severity of the disease 
and the frequency and route of administration. Preferably, daily dosage levels are 
from 5 mg to 2 g. 

Agricultural use 

Modulators identified by the method of the invention may be administered to plants in 
order to prevent or treat fungal infections. The modulators are normally applied in the 
form of compositions together with one or more agriculturally acceptable carriers or 
diluents and can be applied to the crop area or plant to be treated, simultaneously or in 
succession with further compounds. 

The modulators of the invention can be applied together with carriers, surfactants 
or application-promoting adjuvants customarily employed in the art of formulation. 
Suitable carriers and diluents correspond to substances ordinarily employed in 
formulation technology, e.g. natural or regenerated mineral substances, solvents, 
dispersants, wetting agents, tackifiers, binders or fertilizers. 

A preferred method of applying the modulators of the present invention or an 
agrochemical composition which contains them is leaf application. The number of 
applications and the rate of application depend on the intensity of infection by the 
fungus. However, the active ingredients can also penetrate the plant through the roots 
via the soil (systemic action) by impregnating the locus of the plant with a liquid 
composition, or by applying the compounds in solid form to the soil, e.g. in granular 
form (soil application). The active ingredients may also be applied to seeds (coating) 
by impregnating the seeds either with a liquid formulation containing active 
ingredients, or coating them with a solid formulation. In special cases, further types of 
application are also possible, for example, selective treatment of the plant stems or 
buds. 

The active ingredients are used in unmodified form or, preferably, together with 
the adjuvants conventionally employed in the art of formulation, and are therefore 
formulated in known manner to emulsifiable concentrates, coatable pastes, directly 
sprayable or dilutable solutions, dilute emulsions, wettable powders, soluble powders, 
dusts, granulates, and also encapsulations, for example, in polymer substances. Like 
the nature of the compositions, the methods of application, such as spraying, 
atomizing, dusting, scattering or pouring, are chosen in accordance with the intended 
objectives and the prevailing circumstances. Advantageous rates of application are 
normally from 50 g to 5 kg of active ingredient (a.i.) per hectare ("ha", approximately 
2.471 acres), preferably from 100 g to 2 kg a.i./ha, most preferably from 200 g to 500 g 
a.i./ha. 
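
As a purely arithmetical illustration of the application rates quoted above, the following sketch converts a rate in grams of active ingredient per hectare into the quantity needed for a given area and into a per-acre figure. The function names are hypothetical, and the conversion factor is the approximate value of 2.471 acres per hectare given in the text.

```python
# Illustrative sketch only: function names are hypothetical.
ACRES_PER_HECTARE = 2.471  # approximate figure quoted in the text


def active_ingredient_needed(rate_g_per_ha, area_ha):
    """Grams of active ingredient required to treat `area_ha` hectares."""
    return rate_g_per_ha * area_ha


def rate_per_acre(rate_g_per_ha):
    """Convert a rate in g a.i./ha to g a.i. per acre."""
    return rate_g_per_ha / ACRES_PER_HECTARE


if __name__ == "__main__":
    # Lower and upper bounds of the most preferred range, applied to 2.5 ha.
    for rate in (200.0, 500.0):
        total = active_ingredient_needed(rate, 2.5)
        print(f"{rate:.0f} g/ha -> {total:.0f} g total, "
              f"{rate_per_acre(rate):.1f} g/acre")
```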

The formulations, compositions or preparations containing the active ingredients 
and, where appropriate, a solid or liquid adjuvant, are prepared in known manner, for 
example by homogeneously mixing and/or grinding active ingredients with extenders, 
for example solvents, solid carriers and, where appropriate, surface-active compounds 
(surfactants). 

Suitable solvents include aromatic hydrocarbons, preferably the fractions having 
8 to 12 carbon atoms, for example, xylene mixtures or substituted naphthalenes, 
phthalates such as dibutyl phthalate or dioctyl phthalate, aliphatic hydrocarbons such 
as cyclohexane or paraffins, alcohols and glycols and their ethers and esters, such as 
ethanol, ethylene glycol, monomethyl or monoethyl ether, ketones such as 
cyclohexanone, strongly polar solvents such as N-methyl-2-pyrrolidone, dimethyl 
sulfoxide or dimethyl formamide, as well as epoxidized vegetable oils such as 
epoxidized coconut oil or soybean oil; or water. 

The solid carriers used e.g. for dusts and dispersible powders, are normally 
natural mineral fillers such as calcite, talcum, kaolin, montmorillonite or attapulgite. 
In order to improve the physical properties it is also possible to add highly dispersed 
silicic acid or highly dispersed absorbent polymers. Suitable granulated adsorptive 
carriers are porous types, for example pumice, broken brick, sepiolite or bentonite; and 
suitable nonsorbent carriers are materials such as calcite or sand. In addition, a great 
number of pregranulated materials of inorganic or organic nature can be used, e.g. 
especially dolomite or pulverized plant residues. 

Depending on the nature of the active ingredient to be used in the formulation, 
suitable surface-active compounds are non-ionic, cationic and/or anionic surfactants 
having good emulsifying, dispersing and wetting properties. The term "surfactants" 
will also be understood as comprising mixtures of surfactants. 

Suitable anionic surfactants can be both water-soluble soaps and water-soluble 
synthetic surface-active compounds. Suitable soaps are the alkali metal salts, alkaline 
earth metal salts or unsubstituted or substituted ammonium salts of higher fatty acids 
(chains of 10 to 22 carbon atoms), for example the sodium or potassium salts of oleic 
or stearic acid, or of natural fatty acid mixtures which can be obtained for example 
from coconut oil or tallow oil. The fatty acid methyltaurin salts may also be used. 

More frequently, however, so-called synthetic surfactants are used, especially 
fatty sulfonates, fatty sulfates, sulfonated benzimidazole derivatives or 
alkylarylsulfonates. The fatty sulfonates or sulfates are usually in the form of alkali 
metal salts, alkaline earth metal salts or unsubstituted or substituted ammonium salts 
and have an 8 to 22 carbon alkyl radical which also includes the alkyl moiety of alkyl 
radicals, for example, the sodium or calcium salt of lignosulfonic acid, of 
dodecylsulfate or of a mixture of fatty alcohol sulfates obtained from natural fatty 
acids. These compounds also comprise the salts of sulfuric acid esters and sulfonic 
acids of fatty alcohol/ethylene oxide adducts. The sulfonated benzimidazole 
derivatives preferably contain 2 sulfonic acid groups and one fatty acid radical 
containing 8 to 22 carbon atoms. Examples of alkylarylsulfonates are the sodium, 
calcium or triethanolamine salts of dodecylbenzenesulfonic acid, 
dibutylnaphthalenesulfonic acid, or of a naphthalenesulfonic acid/formaldehyde 
condensation product. Also suitable are corresponding phosphates, e.g. salts of the 
phosphoric acid ester of an adduct of p-nonylphenol with 4 to 14 moles of ethylene 
oxide. 

Non-ionic surfactants are preferably polyglycol ether derivatives of aliphatic or 
cycloaliphatic alcohols, or saturated or unsaturated fatty acids and alkylphenols, said 
derivatives containing 3 to 30 glycol ether groups and 8 to 20 carbon atoms in the 
(aliphatic) hydrocarbon moiety and 6 to 18 carbon atoms in the alkyl moiety of the 
alkylphenols. 

Further suitable non-ionic surfactants are the water-soluble adducts of 
polyethylene oxide with polypropylene glycol, ethylenediamine propylene glycol and 
alkylpolypropylene glycol containing 1 to 10 carbon atoms in the alkyl chain, which 
adducts contain 20 to 250 ethylene glycol ether groups and 10 to 100 propylene glycol 
ether groups. These compounds usually contain 1 to 5 ethylene glycol units per 
propylene glycol unit. 

Representative examples of non-ionic surfactants are 
nonylphenolpolyethoxyethanols, castor oil polyglycol ethers, 
polypropylene/polyethylene oxide adducts, tributylphenoxypolyethoxyethanol, 
polyethylene glycol and octylphenoxyethoxyethanol. Fatty acid esters of 
polyoxyethylene sorbitan and polyoxyethylene sorbitan trioleate are also suitable 
non-ionic surfactants. 

Cationic surfactants are preferably quaternary ammonium salts which have, as 
N-substituent, at least one C8-C22 alkyl radical and, as further substituents, lower 
unsubstituted or halogenated alkyl, benzyl or lower hydroxyalkyl radicals. The salts 
are preferably in the form of halides, methylsulfates or ethylsulfates, e.g. 
stearyltrimethylammonium chloride or benzyldi(2-chloroethyl)ethylammonium 
bromide. 

The surfactants customarily employed in the art of formulation are described, for 
example, in "McCutcheon's Detergents and Emulsifiers Annual", MC Publishing 
Corp. Ringwood, New Jersey, 1979, and Sisely and Wood, "Encyclopaedia of Surface 
Active Agents," Chemical Publishing Co., Inc. New York, 1980. 

The agrochemical compositions usually contain from about 0.1 to about 99%, 
preferably about 0.1 to about 95%, and most preferably from about 3 to about 90% of 
the active ingredient, from about 1 to about 99.9%, preferably from about 1 to 99%, 
and most preferably from about 5 to about 95% of a solid or liquid adjuvant, and from 
about 0 to about 25%, preferably about 0.1 to about 25%, and most preferably from 
about 0.1 to about 20% of a surfactant. Whereas commercial products are preferably 
formulated as concentrates, the end user will normally employ dilute formulations. 

All of the features described herein may be combined with any of the above 
aspects, in any combination. 

Embodiments of the invention will now be described by way of example, with 
reference to the accompanying drawings in which:- 

Figure 1 illustrates a multiple sequence alignment of amino acid sequences 
corresponding to fungal and bacterial 2031 and OYE family oxidoreductases; 

Figure 2 illustrates a multiple sequence alignment of nucleic acid sequences 
corresponding to fungal 2031 and OYE family oxidoreductases; 

Figure 3A illustrates the expression of recombinant 2031 OR; B shows purified 
recombinant 2031 OR. 

Figure 4. Phylogenetic tree showing relationships between A. fumigatus 2031 
OR and similar proteins. This demonstrates a 2031 OR clade, which can be 
distinguished from the OYE proteins; 

Figure 5 illustrates reduction of a range of substrates by recombinant 2031 OR. 

Figure 6 illustrates the inhibition of 2031 OR by two compounds identified from 
a screen. 

EXAMPLES 

Example 1. Identification of an essential gene in Aspergillus fumigatus 

An essential region of the A. fumigatus genome was identified using the mycobank 
technology as described in patent WO00177295A1 with the following modifications: 

Re-haploidisation (section 1.6): 

P24 lines 11-18: Conidia (A. fumigatus) were collected from a stable diploid 
transformant colony and approximately 3x10'* spores were used to inoculate 1 ml of 
SAB broth containing 1 mg/ml FPA. This culture was incubated with shaking (200 
rpm) at 37 °C for 20 hours. 100 µl of the culture was spread onto complete media 
containing 0.2 mg/ml FPA and incubated at 37 °C for 3 days or until rapidly 
growing sectors emerged. Conidia were collected from each sector and plated onto 
nitrate, nitrite and hypoxanthine media and the nitrogen utilisation profiles of the 
resulting conidia assessed. Colonies with the nitrogen utilisation profiles of the 
parental strains indicated breakdown of the diploid to a haploid. 44 haploid sectors 
were isolated from transformant 2031. None of the haploids isolated were hygromycin 
resistant, indicating the insertion of the hph gene into a portion of the genome required 
for function. 

Transformation (section 1.7): 

P25 line 9: Plasmid pAN7-1 linearised with HindIII was used as the transforming 
vector. pAN7-1 carries the hph gene which confers hygromycin resistance. 
P25 lines 17-20: 1 ml of cold YED was added to the cuvette and incubated at 37 °C 
for 1 h. Aliquots were spread on selective agar (complete media with 250 µg/ml 
hygromycin). Colonies growing on selective media were deemed putative 
transformants. 

The point of insertion was identified using the plasmid rescue method outlined on page 
31 lines 5-17. The insertion site was confirmed by employing PCR: using the 
sequence obtained from plasmid rescue data, a primer was designed within the 
sequence of pAN7-1 and a complementary primer was designed within the predicted 
sequence near the point of insertion. Genomic DNA isolated from the diploid 2031 
was used as a template. 

The resulting DNA sequence (experiment 2031, with 175 bases of upstream pAN7-1 
sequence removed) corresponds to the gDNA sequence immediately downstream of 
the insertion site and is given as SEQ ID No. 45. 

Example 2. Characterisation of the essential gene 
2.1 Genome analysis 

The TIGR A. fumigatus database (www.TIGR.org) was searched (blastn) with the 
sequence SEQ ID No. 45, identified in Example 1 above, and a match to contig 4798 
(Eval 4.6e-148) was identified. The appropriate region of the contig sequence was 
downloaded from www.tigr.org and gene predictions carried out using Genscan 
(genes.mit.edu/GENSCAN.html; Settings: organism = vertebrate; Suboptimal exon 
cutoffs 1.00). 

The ab initio prediction of genes from genomes is known to be an inaccurate 
process (Burset, M. and Guigo, 1996, Genomics, 34, 353-367) and this is particularly 
so when the programs used have not been specifically trained for the genome under 
examination (as is the case here). It is therefore necessary to carefully examine the 
predictions, to compare any predicted genes with any homologous proteins, and to 
exploit the operative's knowledge of fungal gene structure, and thus to arrive at an 
informed prediction. The predicted genes were therefore compared with similar 
sequences using blastp (http://blast.genome.ad.jp/), the multiple alignment program 
ClustalX (Thompson et al., 1997, Nucleic Acids Research, 24:4876-4882), and the 
alignment editor/viewer Align (Hepperle, D., 2001: Multicolor Sequence Alignment 
Editor. Institute of Freshwater Ecology and Inland Fisheries, 16775 Stechlin, 
Germany). Gene structures were visualised and modified using Artemis 
(http://www.sanger.ac.uk/Software/Artemis/; Rutherford et al., 2000, Bioinformatics 
16, 944-945). 



The gene adjacent to the insertion site corresponded to bases 299-469 (exon 1)
and bases 520-1618 (exon 2) of the genomic sequence given as SEQ ID No. 1. The
protein sequence for the gene is given as SEQ ID No. 3. The insertion site was 735
bases upstream of the 5' ATG start of the gene.
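
The exon coordinates quoted above can be turned into lengths with a few lines of
code. The following is a minimal sketch, assuming the coordinates are 1-based and
inclusive (the usual convention for sequence listings); the function names are
illustrative only and are not part of any tool named in this specification.

```python
from typing import List, Tuple

Interval = Tuple[int, int]  # (start, end), 1-based and inclusive (assumed convention)


def interval_length(interval: Interval) -> int:
    """Number of bases covered by a 1-based inclusive interval."""
    start, end = interval
    if end < start:
        raise ValueError(f"end {end} precedes start {start}")
    return end - start + 1


def intron_intervals(exons: List[Interval]) -> List[Interval]:
    """Gaps between consecutive exons, returned as 1-based inclusive intervals."""
    ordered = sorted(exons)
    return [(prev[1] + 1, nxt[0] - 1) for prev, nxt in zip(ordered, ordered[1:])]


# Exon coordinates quoted in the text for SEQ ID No. 1.
EXONS: List[Interval] = [(299, 469), (520, 1618)]

exon_lengths = [interval_length(e) for e in EXONS]
introns = intron_intervals(EXONS)
intron_lengths = [interval_length(i) for i in introns]

if __name__ == "__main__":
    print("exon lengths:", exon_lengths)
    print("introns:", introns, "lengths:", intron_lengths)
```

The same helpers can be applied to any of the multi-exon gene predictions discussed
in Example 4 once their coordinates are known.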

Searches of the protein databases at http://blast.genome.ad.jp/ showed that
protein SEQ ID No. 3 is a member of the NADH-dependent flavin oxidoreductase
family. This protein is henceforth referred to as 2031 oxidoreductase (2031 OR;
having come from mycobank experiment 2031). Other 2031 OR-like proteins were
also identified (see Example 4.1). The NADH-dependent flavin oxidoreductase family
also includes Old Yellow Enzyme (OYE), from S. cerevisiae and other fungi, although
2031 ORs can be distinguished from OYEs.

Referring to Figure 1, there is shown a multiple alignment of the 2031 OR
amino acid sequence from A. fumigatus along with related ORs from other fungi and
bacteria (see also Example 4). Regions 1-11 refer to amino acids conserved between
ORs.

Fungal 2031 ORs are given by: SEQ ID Nos. 3, 6 and 8, A. fumigatus; SEQ ID
No. 10, A. nidulans; SEQ ID Nos. 12 and 14, C. albicans; SEQ ID Nos. 16 and 19, N.
crassa; SEQ ID Nos. 22 and 44, M. grisea; SEQ ID No. 24 (NP_595868), S. pombe;
SEQ ID No. 27, C. trifolii; SEQ ID Nos. 30, 33 and 35, F. sporotrichioides; SEQ ID
Nos. 38 and 83, F. graminearum; SEQ ID Nos. 40 and 42, M. graminicola; SEQ ID No.
85, U. maydis.

Bacterial ORs resembling 2031 are: T44612 (Pseudomonas putida), SEQ ID No.
86; NP_625402 (Streptomyces coelicolor), SEQ ID No. 87; NP_295913 (Deinococcus
radiodurans), SEQ ID No. 88; AF320254 (Azoarcus evansii), SEQ ID No. 89.

Fungal ORs similar to the Old Yellow Enzyme family (originally identified in S.
cerevisiae): A. fumigatus, Af4875 and Af4961, SEQ ID Nos. 90 and 91 respectively; C.
albicans, Ca2460 and A36990, SEQ ID Nos. 92 and 93 respectively; N. crassa,
Nc4452, SEQ ID No. 94; S. cerevisiae, OYE1, OYE2 and OYE3, SEQ ID Nos. 95-97
respectively.

Details of the sequence searches that identified the ORs other than SEQ ID No.
3, and methods for the construction of multiple alignments are given in Example 4
hereinafter.

Referring to Figure 2, there is shown a multiple alignment of the nucleotide
sequence of 2031 OR from A. fumigatus along with related 2031 ORs from other fungi
and bacteria (see also Example 4). Regions 1-11 refer to amino acids conserved
between 2031 ORs at the amino acid level. Fungal 2031 ORs are given by:
SEQ ID Nos. 1, 2, 4, 5 and 7, A. fumigatus; SEQ ID No. 9, A. nidulans; SEQ ID
Nos. 11 and 13, C. albicans; SEQ ID Nos. 15, 17 and 18, N. crassa; SEQ ID Nos. 20,
21 and 43, M. grisea; SEQ ID No. 23 (NP_595868), S. pombe; SEQ ID Nos. 25 and
26, C. trifolii; SEQ ID Nos. 28, 29, 31, 32 and 34, F. sporotrichioides; SEQ ID Nos.
36, 37 and 82, F. graminearum; SEQ ID Nos. 39 and 41, M. graminicola; SEQ ID No.
84, U. maydis.

Details of the sequence searches that identified the ORs, and methods for the
construction of multiple alignments are given in Example 4 hereinafter.

2.2 Genomic Sequencing of Genes

Following the above bioinformatic analyses, the genomic sequence of 2031 OR was
experimentally determined.

2.2.1 Bacterial and Fungal Strains
For bacterial cloning, E. coli strains Top 10 (Invitrogen) and select96 (Promega) were
used in accordance with manufacturers' instructions.
A. fumigatus clinical isolate AF293 (ref. No. NCPF7367; available to the public
from the NCPF repository, Bristol, U.K.; the CBS repository (Belgium) or from Dr.
David Denning's clinical isolate culture collection, Hope Hospital, Salford, U.K.) is
the preferred strain according to the present invention. AF293 was isolated in 1993
from the lung biopsy of a patient with invasive aspergillosis and aplastic anaemia. It
was donated by Shrewsbury PHLS.

2.2.2 Purification of A. fumigatus genomic DNA
To obtain mycelial material for genomic DNA isolation, approximately 10^ A.
fumigatus conidia were inoculated in 50 ml of Vogel's minimal medium and incubated
with shaking at 200 rpm until late exponential phase (18-24 h) at 37°C. Mycelium was
dried down onto Whatman 54 paper using a Büchner funnel and a side-arm flask
attached to a vacuum pump and washed with PBS/Tween. At this point, the mycelium
could be freeze-dried for extraction at a later date.

The mycelium (fresh or freeze dried) was ground to a powder using liquid
nitrogen in a -20°C cooled mortar. The ground biomass was transferred to 50 ml tubes
on ice up to the 10 ml mark. An equal volume of extraction buffer (0.7 M NaCl; 0.1 M
Na2SO3; 0.1 M Tris-HCl pH 7.5; 0.05 M EDTA; 1% (w/v) SDS; pre-warmed to 65°C)
was then added to each tube, mixed thoroughly with a pipette tip and incubated at 65°C
for 20 minutes in a water bath. A volume of chloroform/isoamyl alcohol (24:1)
equivalent to the volume of the original biomass was then added to each tube, tubes
were mixed thoroughly and incubated on ice for 30 min. Tubes were then centrifuged
at 3,500 x g for 30 min and the aqueous phase carefully transferred to fresh 50 ml
tubes without disturbing the interface.

An equal volume of chloroform/isoamyl alcohol (24:1) was added, the tubes
vortexed and incubated on ice for 15 minutes. Tubes were then spun at 3,500 x g for
15 minutes. After this spin, if large amounts of precipitate were still present, the
supernatant was removed and the chloroform:isoamyl alcohol step repeated. The
supernatant was removed and placed in clean sterile Oak Ridge tubes. An equal
volume of isopropanol was added and mixed gently. Tubes were incubated at room
temperature for at least 15 minutes. Tubes were then centrifuged at 3,030 x g for 10
minutes at 4°C to pellet the DNA. The supernatant was removed and the pellet allowed
to air dry for 10-25 minutes. The pellet was suspended in 2 ml sterile water. 1 ml of
7.5 M ammonium acetate was added, mixed and incubated on ice for 1 hour. Tubes
were centrifuged at 12,000 x g for 30 min, the supernatants transferred to a fresh tube
and 0.54 volumes of isopropanol were added, mixed and incubated at room
temperature for at least 15 minutes. Tubes were then centrifuged at 5,930 x g for 10
min, the supernatant was removed and the pellet washed in 1 ml of 70% ethanol.
Tubes were centrifuged at 5,930 x g for 10 min and all the ethanol was removed. The
pellet was air dried for 20-30 minutes at room temperature and suspended in 0.5-1.0 ml
of TE (10 mM Tris-HCl pH 7.5; 1 mM EDTA). Finally, the DNA was treated with
RNase A (5 µl of 1 mg/ml stock).

2.2.3 PCR Reactions

Primers were designed to the upstream and downstream regions of the A. fumigatus
AF293 2031 OR; cloning primer pair SEQ ID Nos. 46 (Ox9_for) and 47 (Ox10_rev).
The following reagents and conditions were used:



PCR Master Mix

10x high fidelity PCR buffer                     5 µl
dNTP (Clontech: 10 mM)                           1 µl
nH2O                                            39 µl
Pfu Ultra Polymerase (2.5 U/µl)                  1 µl
Forward primer (Ox9_for: 10 pmol/µl stock)       1 µl
Reverse primer (Ox10_rev: 10 pmol/µl stock)      1 µl
gDNA (1:30 dilution of stock)                    2 µl
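
When several reactions are set up at once, the per-reaction volumes above are usually
scaled into a single master mix. The sketch below illustrates one way to do this; the
10% pipetting overage and the choice to add template separately to each tube are
assumptions made for illustration and are not stated in this specification.

```python
from typing import Dict, Iterable

# Per-reaction volumes in µl, taken from the master mix table above.
PER_REACTION_UL: Dict[str, float] = {
    "10x high fidelity PCR buffer": 5.0,
    "dNTP (10 mM)": 1.0,
    "nH2O": 39.0,
    "Pfu Ultra polymerase (2.5 U/µl)": 1.0,
    "Forward primer (10 pmol/µl)": 1.0,
    "Reverse primer (10 pmol/µl)": 1.0,
    "gDNA (1:30 dilution)": 2.0,
}


def scale_master_mix(
    per_reaction: Dict[str, float],
    n_reactions: int,
    overage: float = 0.10,  # assumed pipetting allowance, not from the source
    exclude: Iterable[str] = ("gDNA (1:30 dilution)",),  # template added per tube
) -> Dict[str, float]:
    """Return total volumes (µl) of each shared component for n reactions."""
    factor = n_reactions * (1.0 + overage)
    skipped = set(exclude)
    return {
        name: round(volume * factor, 2)
        for name, volume in per_reaction.items()
        if name not in skipped
    }


def final_concentration(stock: float, volume_added: float, total_volume: float) -> float:
    """Concentration after dilution, in the same units as the stock."""
    return stock * volume_added / total_volume


TOTAL_UL = sum(PER_REACTION_UL.values())
# 10 pmol/µl is numerically equal to 10 µM.
PRIMER_FINAL_UM = final_concentration(10.0, 1.0, TOTAL_UL)

if __name__ == "__main__":
    print("reaction volume (µl):", TOTAL_UL)
    print("final primer concentration (µM):", PRIMER_FINAL_UM)
    for name, volume in scale_master_mix(PER_REACTION_UL, 8).items():
        print(f"{name}: {volume} µl")
```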



PCR Cycle

1) 95°C    2 min
2) 95°C    30 sec
3) 54°C    30 sec
4) 72°C    2 min
5) 72°C    10 min
6) 8°C     Hold

40 cycles of steps 2-4 were carried out and the PCR products were run on a gel. The
product band (1.9 kb) was excised from the gel and purified using Qiagen's QIAquick
Gel Extraction Kit (Qiagen Ltd, Boundary Court, Gatwick Road, Crawley, West
Sussex, RH10 9AX, UK) according to the manufacturer's instructions and eluted into
30 µl of sterile water (BDH molecular biology grade/filter sterile).

2.2.4 Genomic DNA Cloning and Sequencing

Since the gDNA was amplified using Pfu Ultra polymerase, which produces blunt ends,
it was necessary to add 'A' overhangs before ligating into pGEM-Teasy. 12.5 µl of
purified PCR product was incubated with 12.5 µl 2x PCR Reddy Mix (ABgene) at 70°C
for 30 minutes. The sample was then purified using the Qiagen QIAquick gel extraction
kit and eluted in 30 µl of molecular biology grade water.

The PCR product was then ligated into pGEM-Teasy (Promega) using the
following ligation mixture:

2x Buffer          5 µl
pGEM-Teasy         1 µl
PCR product        3 µl
T4 DNA Ligase      1 µl

The reaction was incubated overnight at 4°C.

2 µl of the ligation mix were then added to Select 96 cells (Promega) and
incubated for 20 min on ice. Cells were then heat shocked at 42°C for 45 secs and
placed back on ice. 250 µl of room temp. SOC medium was then added and the cells
incubated for 1 hour at 37°C, with shaking at 220 rpm. 50 and 200 µl amounts were
then plated on to LB agar plates containing ampicillin (100 µg/ml), 50 µl X-gal (4%)
and 10 µl IPTG (100 mM) and incubated overnight at 37°C.

Individual white colonies were picked from each transformation, inoculated into
LB with ampicillin (100 µg/ml) and incubated overnight at 37°C, with shaking at 220
rpm. Plasmid DNA was extracted using Qiagen miniprep kit according to the
manufacturer's instructions. 1 µl of plasmid DNA was digested with EcoRI for 1 hour
at 37°C. Fragment sizes were calculated to be 3 Kb and 1.6 Kb for gDNA and 3 Kb and
1.2 Kb for cDNA. Clones showing the correct restriction digest pattern were
sequenced at MWG Biotech UK Ltd, Waterside House, Peartree Bridge, Milton
Keynes, MK6 3BY. The experimentally determined sequence of 2031 OR was
identical in the coding regions to that identified by bioinformatic analyses (Example
2).

Example 3. cDNA sequencing and RACE for 2031 OR

The internal sequence of the 2031 OR message was experimentally determined by
cloning and sequencing cDNA, and the 5' and 3' ends of the gene were determined by
RACE (Rapid Amplification of cDNA Ends).

3.1 cDNA cloning and sequencing
3.1.1 Preparation of A. fumigatus RNA and cDNA
Fungal cultures were prepared as described in Example 2.2.2. Cultures were harvested
by filtration, then washed twice with DEPC-treated water and transferred to a 50 ml
Falcon tube. Samples were frozen in liquid nitrogen and stored at -80°C until required.

To prepare RNA, fungal samples were ground to a fine powder under liquid
nitrogen. RNA was then extracted using the Qiagen RNeasy Plant Mini Kit following
the protocol for isolation of total RNA from filamentous fungi in the RNeasy Mini
Handbook (06/2001, Pages 75-78,
http://www.qiagen.com/literature/handbooks/rna/rnamini/1016272HBRNY_062001WW.pdf).
The following modifications were used: At step 3, RLC was used as the lysis buffer of
choice; At step 7, the RNeasy column was incubated for 5 min at room temperature after
addition of RW1; The optional step 9a was carried out; At step 10, 30 µl RNase-free
water was added, the samples incubated for 10 min at room temperature, and then
centrifuged; At step 11, the elution step was repeated to give a total volume of 60 µl RNA.

DNA contamination was removed from the RNA by the addition of DNase, using
2 µl DNase per µg RNA, in the presence of 10X DNase buffer and incubating at 37°C
for 2 h. DNase-treated RNA was cleaned up using the RNeasy Plant Mini Kit following
the RNeasy Mini Protocol for RNA Cleanup (RNeasy Mini Handbook 06/2001, pages
79-81).

To synthesise cDNA from the above RNA the following reaction mixture was
prepared: 100 ng-1 µg of DNA-free RNA, 3 µl oligo (dT) (100 ng/µl), and DEPC-treated
water to a total volume of 42 µl. Samples were incubated in a heat block at 65°C for 5
min after which they were allowed to cool slowly to room temperature. Then 2 µl
Ultrapure dNTPs, 1 µl reverse transcriptase (Stratascript) and 5 µl 10X reverse
transcriptase reaction buffer (Stratascript) were added. Samples were incubated at 42°C
for 1 h, denatured at 90°C for 5 min and then cooled on ice.

3.1.2 Production of cDNA constructs
PCR was carried out using the cDNA above to generate cDNA fragments using the
primer pair SEQ ID No. 48 (Ox1_for) and SEQ ID No. 49 (Ox3_rev). PCR reactions
were carried out using the following reagents and conditions:



PCR Master Mix

10x high fidelity PCR buffer                     5 µl
dNTP (Clontech: 10 mM)                           1 µl
MgSO4 (50 mM)                                    2 µl
nH2O                                            37.3 µl
Platinum Taq Polymerase (5 U/µl)                 0.2 µl
Forward primer (Ox1_for: 10 pmol/µl stock)       1 µl
Reverse primer (Ox3_rev: 10 pmol/µl stock)
cDNA                                             2 µl



PCR Cycle

1) 94°C    5 min
2) 94°C    30 sec
3) 53°C    30 sec
4) 68°C    90 sec
5) 68°C    10 min
6) 8°C     Pause

Steps 2-4 were run 40 times in total. The amplicon was 1269 bp. The PCR products
were purified using Qiagen's QIAquick PCR Purification Kit (Qiagen Ltd, Boundary
Court, Gatwick Road, Crawley, West Sussex, RH10 9AX, UK) according to the
manufacturer's instructions. The purified PCR products were examined on agarose gels.

PCR products were ligated into pGEM-Teasy, used to transform Select 96 cells,
and sequenced as described in 2.2.4 above. The cDNA sequence obtained is given as
bases 115-1385 of SEQ ID No. 2.

3.2 RACE
To determine the 5' and 3' ends of the genes, RACE (Rapid Amplification of cDNA
Ends) was carried out, using the GeneRacer™ Kit (Invitrogen; cat No. L1502-01),
essentially as per manufacturer's instructions.

3.2.1 Preparation of RNA

A. fumigatus biomass was prepared as described in 2.2.2. RNA was prepared using the
FastRNA kit (QBIOgene) following the manufacturer's instructions (Revision 6030-999-1 JOS)
with the following amendments: At step 1, 40 mg of biomass was used per
extraction; At step 2, samples were processed for 20 seconds at speed 5, incubated on
ice for 3 minutes, and processed again for 20 seconds at speed 5; At step 3, samples
were centrifuged for 5 minutes; At step 5, 500 µl DIPS were added, mixed, and
incubated at room temperature for 2 minutes. Samples were mixed again and incubated
for a further 2 minutes; At step 6, two washes in 250 µl SEWS were carried out; At step
7, the pellet was dissolved in 50 µl SAFE buffer.

3.2.2 RACE

1 µg total RNA prepared as described above was de-phosphorylated in a 10 µl reaction
using 10 units of calf intestinal phosphatase (CIP), 1 µl 10X CIP buffer and 40 U
RNaseOut™ (made up to 10 µl in DEPC water) at 50°C for 1 hour. Samples were then
made up to 100 µl with DEPC water and the RNA extracted with 100 µl (25:24:1)
phenol:chloroform:isoamyl alcohol. RNA was then precipitated by the addition of 2 µl
mussel glycogen (10 mg/ml), 10 µl 3 M sodium acetate, pH 5.2 and 220 µl 95% ethanol
and the sample frozen on dry ice for 10 minutes. RNA was pelleted by centrifugation
at 14,500 rpm for 20 minutes at 4°C, washed with 70% ethanol, air dried and
re-suspended in 8 µl DEPC water.
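
The pelleting step above is quoted in rpm, whereas the DNA preparation in 2.2.2 is
quoted in x g. The two are related by the standard approximation
RCF = 1.118e-5 x r x rpm^2, with r the rotor radius in cm. The sketch below applies
that relationship; the 7 cm radius is a purely hypothetical value chosen for
illustration, since no rotor is specified in the text.

```python
import math

# Standard conversion factor for rotor radius in cm and speed in rpm.
RCF_CONSTANT = 1.118e-5


def rcf_from_rpm(rpm: float, radius_cm: float) -> float:
    """Relative centrifugal force (x g) for a given speed and rotor radius."""
    return RCF_CONSTANT * radius_cm * rpm ** 2


def rpm_from_rcf(rcf: float, radius_cm: float) -> float:
    """Rotor speed (rpm) needed to reach a given relative centrifugal force."""
    return math.sqrt(rcf / (RCF_CONSTANT * radius_cm))


if __name__ == "__main__":
    HYPOTHETICAL_RADIUS_CM = 7.0  # assumption: not given in the specification
    print(round(rcf_from_rpm(14500, HYPOTHETICAL_RADIUS_CM)), "x g")
    print(round(rpm_from_rcf(12000, HYPOTHETICAL_RADIUS_CM)), "rpm")
```

Because the result depends linearly on the radius, the rotor actually used should be
substituted before relying on either figure.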

De-phosphorylated RNA (7 µl) was de-capped in a 10 µl reaction with 0.5 U
tobacco acid pyrophosphatase (TAP), 1 µl 10x TAP buffer and 40 U RNaseOut™ for 1
hour at 37°C. RNA was extracted with phenol:chloroform and precipitated as above,
and then re-suspended in 7 µl DEPC-treated water.

De-phosphorylated, de-capped RNA (7 µl) was added to the pre-aliquoted
GeneRacer™ RNA Oligo (0.25 µg) and incubated at 65°C for 5 minutes. A 10 µl
ligation reaction was then set up by the addition of 1 µl 10x ligase buffer, 1 µl 10 mM
ATP, 40 U RNaseOut™ and 5 U T4 RNA ligase and incubated at 37°C for 1 hour. RNA
was extracted and precipitated as described previously and re-suspended in 11 µl
DEPC-treated water.

First-strand cDNA was prepared by the addition of 1 µl GeneRacer™ Oligo dT
primer and 1 µl dNTP mix (10 mM each) to 10 µl ligated RNA and incubated at 65°C
for 5 minutes. The following reagents were added to the 12 µl ligated RNA and primer
mix: 4 µl 5x first strand buffer, 2 µl 0.1 M DTT, 1 µl RNaseOut™ and 1 µl
SuperScript™ II RT (200 U/µl), and incubated first at 42°C for 50 minutes and then, to
stop the reaction, at 70°C for 15 minutes. 2 U RNase H was added to the reaction mix
and incubated at 37°C for 20 minutes.

To amplify the 5' cDNA ends a 50 µl PCR reaction was set up using 1 µl of the
RACE-ready cDNA prepared above, 1 µl GeneRacer™ 5' primer, 1 µl reverse
gene-specific primer (SEQ ID No. 50; Ox6race_rev: 5 pmol/µl stock), 1 µl dNTP solution
(10 mM each), 2 µl 50 mM MgSO4, 5 µl High Fidelity PCR buffer, 0.5 µl Platinum®
Taq DNA Polymerase High Fidelity (5 U/µl) and 38.5 µl sterile water. Cycling
parameters are given in Table II below.

A second, nested PCR stage was then set up using 1 µl of the RACE cDNA from
the first stage above, 1 µl Nested 5' primer (supplied with kit), 1 µl reverse
gene-specific primer (SEQ ID No. 50; Ox6race_rev: 5 pmol/µl stock), 1 µl dNTP solution
(10 mM each), 2 µl 50 mM MgSO4, 5 µl High Fidelity PCR buffer, 0.5 µl Platinum®
Taq DNA Polymerase High Fidelity (5 U/µl) and 38.5 µl sterile water. Cycling
parameters are given in Table II below.

To amplify 3' ends a 50 µl PCR reaction was set up using 1 µl of the RACE-ready
cDNA prepared above, 1 µl GeneRacer™ 3' primer (10 µM), 1 µl forward
gene-specific primer (SEQ ID No. 51; Ox7race_for: 5 pmol/µl stock), 1 µl dNTP solution
(10 mM each), 2 µl 50 mM MgSO4, 5 µl High Fidelity PCR buffer, 0.5 µl Platinum®
Taq DNA Polymerase High Fidelity (5 U/µl) and 38.5 µl sterile water. Cycling
parameters are given in Table II below.

A second, nested PCR stage was then set up using 1 µl of the 3' RACE cDNA
from the first stage above, 1 µl Nested 3' primer (supplied with kit), 1 µl forward
gene-specific primer (SEQ ID No. 52; Ox8race_for: 5 pmol/µl stock), 1 µl dNTP solution
(10 mM each), 2 µl 50 mM MgSO4, 5 µl High Fidelity PCR buffer, 0.5 µl Platinum®
Taq DNA Polymerase High Fidelity (5 U/µl) and 38.5 µl sterile water. Cycling
parameters are given in Table II below.



Table II. Cycling parameters for 5' and 3' RACE

5' and 3' RACE                        Nested PCR
Temp.    Time      Cycles             Temp.    Time      Cycles
94°C     2 min     1 cycle            94°C     2 min     1 cycle
94°C     30 s      5 cycles           94°C     30 sec    25 cycles
72°C     1 min                        67°C     30 sec
94°C     30 s      5 cycles           68°C     1 min
70°C     1 min                        68°C     10 min    1 cycle
94°C     30 s      25 cycles          8°C      Hold
64°C     30 s
68°C     1 min
68°C     10 min    1 cycle
8°C      Hold
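
The staged ("touchdown") structure of Table II can be written down as data, which
makes it straightforward to count amplification cycles and estimate programmed hold
time. The following is a minimal sketch based on the table as read here; it ignores
ramping between temperatures and the final 8°C hold, so real run times will be longer.

```python
from typing import List, Tuple

Step = Tuple[int, int]          # (temperature in °C, hold time in seconds)
Stage = Tuple[List[Step], int]  # (steps in one cycle, number of cycles)

# First-round 5' and 3' RACE program, as read from Table II.
PRIMARY_RACE: List[Stage] = [
    ([(94, 120)], 1),
    ([(94, 30), (72, 60)], 5),
    ([(94, 30), (70, 60)], 5),
    ([(94, 30), (64, 30), (68, 60)], 25),
    ([(68, 600)], 1),
]

# Nested PCR program, as read from Table II.
NESTED_PCR: List[Stage] = [
    ([(94, 120)], 1),
    ([(94, 30), (67, 30), (68, 60)], 25),
    ([(68, 600)], 1),
]


def hold_seconds(program: List[Stage]) -> int:
    """Sum of programmed hold times, excluding ramping and the final hold."""
    return sum(cycles * sum(t for _, t in steps) for steps, cycles in program)


def amplification_cycles(program: List[Stage]) -> int:
    """Count cycles in multi-step stages (denaturation plus annealing/extension)."""
    return sum(cycles for steps, cycles in program if len(steps) > 1)


if __name__ == "__main__":
    for name, program in (("primary", PRIMARY_RACE), ("nested", NESTED_PCR)):
        minutes = hold_seconds(program) / 60
        print(name, amplification_cycles(program), "cycles,", minutes, "min of holds")
```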











5' and 3' RACE confirmed the predicted 5' ATG and 3' stop codon as well as giving
the 5' and 3' untranslated regions shown as bases 1-114 and 1385-1921 of SEQ ID
No. 2. The coding sequence for 2031 OR thus determined was identical to that given
as bases 299-469 and 520-1618 of the gDNA given as SEQ ID No. 1.
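
A comparison of this kind, between a spliced genomic sequence and an experimentally
determined cDNA, can be sketched as follows. The sequences used here are short toy
strings, not SEQ ID Nos. 1 or 2, and the helper names are hypothetical; coordinates
are assumed to be 1-based and inclusive.

```python
from typing import List, Optional, Tuple


def splice_exons(genomic: str, exons: List[Tuple[int, int]]) -> str:
    """Concatenate exon segments given 1-based inclusive coordinates."""
    return "".join(genomic[start - 1:end] for start, end in sorted(exons))


def first_mismatch(a: str, b: str) -> Optional[int]:
    """1-based position of the first difference, or None if identical."""
    for position, (x, y) in enumerate(zip(a, b), start=1):
        if x != y:
            return position
    if len(a) != len(b):
        return min(len(a), len(b)) + 1
    return None


if __name__ == "__main__":
    toy_genomic = "AAACCCGGGTTT"   # toy data: exon, 3-base "intron", exon, tail
    toy_exons = [(1, 3), (7, 9)]
    toy_cdna = "AAAGGG"
    spliced = splice_exons(toy_genomic, toy_exons)
    print(spliced, first_mismatch(spliced, toy_cdna))
```

With the real sequences, the exon list would be [(299, 469), (520, 1618)] and the
cDNA would be the coding portion of SEQ ID No. 2.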

Example 4. Identification of other fungal 2031 ORs and related genes 



Homologs of A. fumigatus 2031 OR were identified in other fungi and bacteria by
means of bioinformatics analysis. Sequences identified by bioinformatics can be used
to design primers which in turn can be used in PCR to generate DNA coding for the
2031 OR homolog.

Alternatively, degenerate PCR can be used to obtain sequence for novel genes,
which can then be used to generate probes for screening cDNA or genomic libraries of
the organism of interest to identify clones containing the 2031 OR homolog. As a
further alternative, Southern blots using fragments of genes from one species as
probes can be used to identify the presence of a homolog in the genome of a second
species. The same probe can then be used to screen cDNA or genomic DNA libraries.
Once clones corresponding to the novel genes have been identified they can be
expressed for functional characterisation of the protein.

4.1 Identification of homologs by bioinformatics

Analysis of the 2031 OR protein sequence with PFAM
(http://www.sanger.ac.uk/Software/Pfam/) identified this as a member of the
Oxidored_FMN family (PF00724), E-value 3.6e-57. This includes the well-characterised
"Old Yellow Enzyme" proteins of S. cerevisiae and other fungi.

Homologs of A. fumigatus 2031 OR sequence were identified by database
searches (see Table III). Where necessary, matching contigs were down-loaded and
genes predicted from genomic DNA by Genscan analysis, blast searches, alignment
and visualisation with Artemis as described in Example 2. Protein and nucleotide
multiple alignments were generated for 2031 OR and related genes (Figures 1 and 2).

Protein and nucleic acid multiple alignments are generated by means of
programs such as ClustalX (Thompson et al., 1994, Nucleic Acids Research, 22, 4673-4680;
Thompson et al., 1997, Nucleic Acids Research, 24, 4876-4882) and/or using
manual alignment editors such as Align (Hepperle, D., 2001: Multicolor Sequence
Alignment Editor. Institute of Freshwater Ecology and Inland Fisheries, 16775
Stechlin, Germany).
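
Once two sequences have been aligned by a program of this kind, a simple percent
identity can be computed from the aligned (gapped) strings. The sketch below uses one
common convention, counting only columns in which neither sequence has a gap; other
conventions exist, and the toy sequences shown are not taken from Figures 1 or 2.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity over columns where neither aligned sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        if x == gap or y == gap:
            continue  # skip gapped columns under this convention
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        return 0.0
    return 100.0 * matches / compared


if __name__ == "__main__":
    # Toy aligned fragments, for illustration only.
    print(percent_identity("MKT-LV", "MKSALV"))
```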

Table III: 2031 homologs identified by database searches

Contig/EST/                E-value¹    SEQ ID No.                      Species (details of search
predicted gene                         EST/gDNA   cDNA²    Protein    given in footnotes)

4929                       6.6e-81     4          5        6          Aspergillus fumigatus³
4951                       1.1e-68     7          -        8          Aspergillus fumigatus³
4875                       5.7e-13     -          -        -          Aspergillus fumigatus³
4961                       3.2e-10     -          -        -          Aspergillus fumigatus³
1.112                      3e-33       9          -        10         Aspergillus nidulans⁴
6-2431                     2.6e-77     11         -        12         Candida albicans⁵
6-2464                     5.9e-50     13         -        14         Candida albicans⁵
6-2460                     5.8e-19     -          -        -          Candida albicans⁵
A36990                     1e-15       -          -        -          Candida albicans⁶
NCU07452.1                 7e-94       15         -        16         Neurospora crassa⁷
NCU08900.1                 2e-19       17         18       19         Neurospora crassa⁷
NCU04452.1                 2e-23       -          -        -          Neurospora crassa⁷
MG04569.3                  1e-106      20         21       22         Magnaporthe grisea⁸
MG03823.3                  8e-19       43         -        44         Magnaporthe grisea⁸
NP_595868                  1e-05       23         -        24         Schizosaccharomyces pombe⁶
OYE1                       1e-15       -          -        -          Saccharomyces cerevisiae⁹
OYE2                       4.5e-19     -          -        -          Saccharomyces cerevisiae⁹
OYE3                       1.0e-16     -          -        -          Saccharomyces cerevisiae⁹
FsCon[0063] (EST contig)   1e-82       28         29       30         Fusarium sporotrichioides¹⁰
Gz15771741                 5e-76       36         37       38         Fusarium graminearum¹⁰
Mg[0281] (EST contig)      2e-67       39         -        40         Mycosphaerella graminicola¹⁰
CtCon[0249] (EST contig)   1e-55       25         26       27         Colletotrichum trifolii¹⁰
FsCon[0458] (EST contig)   1e-42       34         -        35         Fusarium sporotrichioides¹⁰
FsCon[0237] (EST contig)   1e-40       31         32       33         Fusarium sporotrichioides¹⁰
Mga0328f                   3e-35       41         -        42         Mycosphaerella graminicola¹⁰
T44612                     1e-52       -          -        -          Pseudomonas putida¹¹
NP_625402                  1e-79       -          -        -          Streptomyces coelicolor¹¹
NP_295913                  1e-78       -          -        -          Deinococcus radiodurans¹¹
AF320254                   5e-55       -          -        -          Deinococcus radiodurans¹¹
FG00074.1                              82         82       83         Fusarium graminearum¹²
Contig 1.2                 1e-71       84         84       85         Ustilago maydis¹³



¹ E-values for blast scores refer to searches with 2031 OR protein unless specified
otherwise in footnotes.
² cDNA was generated in cases where either the gene contains multiple exons, or
there are probable frame-shift errors from sequencing of the EST, or the EST given is
the non-coding strand.
³ Search of the A. fumigatus genome at http://www.TIGR.org (tblastn) with
NP_595868.
⁴ Search of A. nidulans genome held on local machine (tblastn).
⁵ Search of the C. albicans genome at
http://www-sequence.stanford.edu/group/candida/ (blastp).
⁶ Search of the non-redundant protein sequence database (nr) at
http://blast.genome.ad.jp (blastp).
⁷ Search of the N. crassa predicted proteins at
http://www.broad.mit.edu/annotation/fungi/neurospora/ (blastp).
⁸ Search of the M. grisea predicted proteins at
http://www.broad.mit.edu/annotation/fungi/magnaporthe/ (blastp).
⁹ Search of S. cerevisiae orf proteins
(http://mips.gsf.de/cgi-bin/blast/blast_page?genus=yeast).
¹⁰ Search of COGEME pathogenic fungal EST database at
http://cogeme.ex.ac.uk/blast.html (tblastn, max E-val=0.1).
¹¹ Search of NCBI non-redundant protein database on local machine with SEQ ID No.
1 (blastx). Only a selected set of hits against bacterial proteins are shown.
¹² Search of F. graminearum predicted proteins held on local machine (blastp).
¹³ Search of U. maydis contigs held on local machine (tblastn).
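
E-values such as those in Table III are convenient to handle programmatically, for
example to rank hits or to apply a cut-off. The sketch below parses a handful of rows
from the table; the 1e-30 threshold is an arbitrary value chosen for illustration and
is not a criterion used in this specification. E-values from searches of different
databases are not strictly comparable, so a ranking of this kind is only indicative.

```python
from typing import List, Tuple

Hit = Tuple[str, str, str]  # (identifier, species, E-value as printed)

# A few rows copied from Table III.
HITS: List[Hit] = [
    ("4929", "Aspergillus fumigatus", "6.6e-81"),
    ("NCU07452.1", "Neurospora crassa", "7e-94"),
    ("MG04569.3", "Magnaporthe grisea", "1e-106"),
    ("NP_595868", "Schizosaccharomyces pombe", "1e-05"),
    ("OYE1", "Saccharomyces cerevisiae", "1e-15"),
]


def filter_hits(hits: List[Hit], max_evalue: float) -> List[Hit]:
    """Keep hits at or below the cut-off, strongest (smallest E-value) first."""
    kept = [hit for hit in hits if float(hit[2]) <= max_evalue]
    return sorted(kept, key=lambda hit: float(hit[2]))


ILLUSTRATIVE_CUTOFF = 1e-30  # assumption for this example only
strong_hits = filter_hits(HITS, ILLUSTRATIVE_CUTOFF)

if __name__ == "__main__":
    for identifier, species, evalue in strong_hits:
        print(f"{identifier}\t{species}\t{evalue}")
```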

To clarify the relationships between the 2031 OR, OYE and the hits identified from
blast searches, phylogenetic analysis was carried out. The PHYLIP suite of programs
was used (Felsenstein, J., 2002. PHYLIP (Phylogeny Inference Package)
version 3.6a3. Distributed by the author. Department of Genome Sciences, University
of Washington, Seattle). The multiple alignment used for the analyses was essentially
that given in Figure 1 with partial sequences, gapped regions and unreliably aligned
sections excluded. A distance matrix was generated using PROTDIST with the
Jones-Taylor-Thornton model and the tree inferred using FITCH with global rearrangements
and 10 jumbles of input order. 100 bootstrap replicates were generated using
SEQBOOT, distance matrices generated using PROTDIST as above, trees inferred
using NEIGHBOR, and then bootstrap values and the consensus tree were calculated
using CONSENSE. Trees were viewed using TREEVIEW (Page, R. D. M.,
1996. TREEVIEW: An application to display phylogenetic trees on personal
computers. Computer Applications in the Biosciences 12, 357-358).
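
The bootstrap step can be illustrated in a few lines: each replicate is built by
sampling alignment columns with replacement until the replicate has as many columns as
the original. The sketch below shows this resampling idea only; it is not the SEQBOOT
implementation, and the toy alignment and seed are assumptions for illustration.

```python
import random
from typing import Dict


def bootstrap_alignment(alignment: Dict[str, str], rng: random.Random) -> Dict[str, str]:
    """Resample alignment columns with replacement to make one bootstrap replicate."""
    lengths = {len(seq) for seq in alignment.values()}
    if len(lengths) != 1:
        raise ValueError("all aligned sequences must have the same length")
    n_columns = lengths.pop()
    # The same sampled column indices are applied to every sequence.
    columns = [rng.randrange(n_columns) for _ in range(n_columns)]
    return {name: "".join(seq[c] for c in columns) for name, seq in alignment.items()}


# Toy alignment, for illustration only.
TOY_ALIGNMENT: Dict[str, str] = {
    "seqA": "MKTALV",
    "seqB": "MKSALV",
    "seqC": "MRTALI",
}

if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so the replicates are reproducible
    for _ in range(3):
        print(bootstrap_alignment(TOY_ALIGNMENT, rng))
```

Each replicate would then be passed through the same distance and tree-building steps
as the original alignment, and the resulting trees summarised as a consensus.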
Phylogenetic analysis identified a clade supported by good bootstrap values,
which included A. fumigatus 2031 OR and other enzymes. This could be distinguished
from a clade containing OYE enzymes which was also supported by good bootstrap
values. Bacterial homologs of both 2031 OR and OYE (not shown) were also
identified. We have therefore identified a set of 2031 OR homologs which,
surprisingly, is distinct from the well-characterised OYE family, and which, by virtue
of the essentiality demonstrated for A. fumigatus 2031 OR, represents a set of
potential targets for anti-fungal drugs.

4.2 Identification of homologs by degenerate PCR

4.2.1 Preparation of genomic DNA from organism of interest

Fungal cultures are prepared using methods suitable for particular species. For
example, Aspergillus and Candida species, Cryptococcus neoformans, Fusarium
solani and Trichophyton species are maintained on Sabouraud dextrose agar at
30-35°C; Leptosphaeria nodorum on Malt agar medium (30 g/L malt extract; 15 g/L
Bacto-agar, pH 5.5), 24.0°C; Magnaporthe grisea on Oatmeal agar (6.1 g/L agar, 53.3
g/L instant oatmeal) 25.0°C, or Cornmeal agar (Difco 0386), 26.0°C; Phytophthora
capsici cultures were maintained on V-8 agar at 24°C; Pyricularia oryzae cultures
were maintained on rice polish agar at 24°C under white fluorescent lights (12 hr
artificial day), and were subcultured every 7-14 days by the transfer of mycelial plugs
to fresh plates; Pythium ultimum cultures were maintained on PDA at 24°C, and
subcultured every 7 days by the transfer of aerial mycelium to fresh plates with an
inoculating needle; Rhizoctonia solani cultures were maintained on PDA at 24°C under
fluorescent lights (12 h artificial day), and subcultured every 7 days by the transfer of
mycelial plugs to fresh plates; Ustilago maydis cultures were maintained on PDY agar
at 30°C in the dark, and subcultured by re-streaking.

Genomic DNA was prepared from cultures using standard methodologies, e.g.
using the Qiagen DNeasy Plant Kit, or using methods described in Example 2.2.

4.2.2 PCR

Primers (SEQ ID Nos. 53 and 54) were designed on the 2031 OR-specific regions
given as regions 2 and 6 in Figure 2. However, those skilled in the art will appreciate
that it may be necessary to try alternative primers. PCR reactions using the above
primer pair are set up as follows:

12.5 µl 2x ReddyMix PCR mastermix (ABgene)
1 µl primer SEQ ID No. 53 (5 pmol)
1 µl primer SEQ ID No. 54 (5 pmol)
template gDNA (1.5-4 µg/ml)
nuclease-free water to give a final volume of 25 µl

The reactions are run using the following conditions on a Biometra personal PCR
cycler (Thistle Scientific Ltd, DFDS House, Goldie Road, Uddington, Glasgow, G71
6NZ):

Step 1    95°C    5 min
Step 2    95°C    1 min
Step 3    53°C    1 min 30 sec
Step 4    68°C    2 min 30 sec
Step 5    72°C    10 min
Step 6    4°C     Hold



30 cycles of steps 2-4 are carried out. The PCR products are purified (to remove
residual enzymes and nucleotides) using Qiagen's QIAquick PCR Purification Kit
(Qiagen Ltd, Boundary Court, Gatwick Road, Crawley, West Sussex, RH10 9AX, UK)
according to the manufacturer's instructions and eluted into 40 µl of sterile water (BDH
molecular biology grade/filter sterile). The purified PCR products are examined on 1%
agarose gels.

Those skilled in the art will appreciate that degenerate PCR may require
variations in a number of parameters in the attempts to generate a product. These
include primer concentration, template concentration, concentration of Mg2+ ions,
elongation and annealing times, and annealing temperature. Variations in temperature
can be accommodated by the use of a gradient PCR machine.
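
One further parameter that is easy to check in advance is the degeneracy of a primer,
i.e. the number of distinct sequences it represents when written with IUPAC ambiguity
codes. The sketch below computes this and enumerates the variants; the example primer
is hypothetical and is not SEQ ID No. 53 or 54.

```python
from itertools import product
from typing import Dict, List

# Standard IUPAC nucleotide ambiguity codes.
IUPAC: Dict[str, str] = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def degeneracy(primer: str) -> int:
    """Number of distinct sequences encoded by a degenerate primer."""
    count = 1
    for base in primer.upper():
        count *= len(IUPAC[base])
    return count


def expand(primer: str) -> List[str]:
    """Enumerate every non-degenerate sequence encoded by the primer."""
    choices = (IUPAC[base] for base in primer.upper())
    return ["".join(combo) for combo in product(*choices)]


if __name__ == "__main__":
    hypothetical_primer = "GGNTAYCAR"  # illustrative only
    print(degeneracy(hypothetical_primer))
    print(expand(hypothetical_primer)[:4])
```

Highly degenerate primers dilute the effective concentration of any one matching
sequence, which is one reason primer concentration appears in the list above.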

The purified PGR products are cloned into pPEM-Teasy (Promega) and then 
transformed into XLIO-Gold® Kan ultracompetent E. coli ceUs according to the 

2 5 manufacturer's instructions. The tiransformation reactions are then plated onto LB agar 

plates contaming ampiciUin (100 |ag/ml), 50 |il X-gal (4%) and 10 \il IPTG (100 mM). 
Following overnight incubation at 37°G, individual white colonies from each 
transformation are sub-cultured into LB broth containing ampicillin (100 jj.g/ml). After 
overnight incubation at S7°G wifb shaking, plasmids are extiracted using Qiagen spin 

3 0 mini plasmid extraction kits according to the manufacturers instructions and sent away 

for fiiU-length sequencing. 
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4.3 Identification of homologs bv Southern Blotting 

4.3.1. Digestion of genomic DNA and transfer to nylon membranes

Genomic DNA from the fungi of interest is digested with the appropriate restriction
enzyme and run on a 0.8% agarose gel. The gel is then submerged in 250 mM HCl for
no more than 10 min, with shaking, at room temperature, after which the gel is rinsed
with sterilised RO water.

Transfer of the DNA onto nylon membrane is carried out using 0.4 M NaOH.
Transfer protocols and apparatus are well known and are described in e.g. Sambrook et
al., (1989), Molecular Cloning, 2nd Edition, Cold Spring Harbor Laboratory Press.
After transfer, the DNA is fixed to the membrane by baking at 120°C for 30 min. The
membrane can then be used immediately, or stored dry for future use.

4.3.2. Preparation of probe

Probes are generated either by restriction digests of DNA or by PCR of an appropriate
region. A suitable probe can be generated by PCR using the primer pair SEQ ID Nos.
53 and 54, A. fumigatus genomic DNA, and the methods given in 4.2.2.

1 µg DNA template is diluted in molecular biology water to a total volume of 16
µl, denatured in a boiling water bath for 10 min, and quickly chilled on ice. 4 µl DIG-
High Prime (1 mM dATP, 1 mM dCTP, 1 mM dGTP, 0.65 mM dTTP, 0.35 mM alkali-
labile-digoxigenin-11-dUTP, 1 U/µl labelling grade Klenow enzyme, 5 x reaction
buffer, in 50% (v/v) glycerol) is then added and the reaction incubated at 37°C for 20
hours, after which 2 µl of 200 mM EDTA pH 8.0 is added to terminate the labelling
reaction. The labelling efficiency is estimated by comparison with DIG-labelled
control DNA.

4.3.3. Prehybridisation and Hybridisation

The membrane is placed in a hybridisation tube containing 20 ml of prehybridisation
solution (DIG Easy Hyb, Roche) per 100 cm² of membrane surface area and
prehybridised at 42°C for 2 hours in a hybridisation oven. The DIG-labelled probe is
denatured by heating in a boiling water bath for 10 min and then chilled directly on
ice. The probe is then diluted to ~200 ng/mL in hybridisation solution (Easy Hyb,
Roche; at least 5 mL of hybridisation solution is required per hybridisation). The
prehybridisation solution is discarded from the hybridisation tube and the hybridisation
solution containing the DIG-labelled probe added quickly. The hybridisation then
proceeds overnight at 42°C in the hybridisation oven. The optimum temperature is
dependent on probe size and homology with the target sequence and was determined
empirically.

After hybridisation, the membrane is washed twice at 42°C, 5 min per wash,
with 50 mL of stringency wash solution (3 x SSC, 0.1% SDS; where 20 x SSC buffer
is 3 M NaCl, 300 mM sodium citrate, pH 7.0), followed by two washes at room
temperature, 15 min per wash, in 50 mL stringency wash solution. The stringency of
these washes can be decreased by increasing the SSC concentration to 6 x SSC,
0.1% SDS and/or decreasing the wash temperatures.
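
The wash solution is specified relative to a 20 x SSC stock, so preparing it is a simple C1V1 = C2V2 dilution. The following Python sketch works through that arithmetic for one 50 mL wash. The 10% SDS stock and the function name are illustrative assumptions; the text defines only the 20 x SSC stock.

```python
def wash_solution_volumes(total_ml, ssc_final_x, sds_final_pct,
                          ssc_stock_x=20.0, sds_stock_pct=10.0):
    """Return mL of SSC stock, SDS stock and water for a wash solution.

    Applies C1*V1 = C2*V2 to each component. The 10% SDS stock is an
    assumed concentration, not one stated in the protocol.
    """
    ssc_ml = total_ml * ssc_final_x / ssc_stock_x
    sds_ml = total_ml * sds_final_pct / sds_stock_pct
    water_ml = total_ml - ssc_ml - sds_ml
    return {"ssc_stock_ml": ssc_ml, "sds_stock_ml": sds_ml, "water_ml": water_ml}


# Standard stringency wash (3 x SSC) and the lower-stringency variant (6 x SSC).
standard = wash_solution_volumes(50.0, 3.0, 0.1)
relaxed = wash_solution_volumes(50.0, 6.0, 0.1)
print(standard)
print(relaxed)
```

Doubling the SSC concentration doubles the volume of stock required and reduces the water accordingly; the SDS volume is unchanged.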

4.3.4. Detection

The membrane is washed in 20 mL washing buffer (100 mM maleic acid, 150 mM
NaCl; pH 7.5; 0.3% v/v Tween 20), and then incubated successively with the
following: 20 mL blocking solution (1% w/v blocking reagent for nucleic acid
hybridisation, Roche, dissolved in 100 mM maleic acid, 150 mM NaCl, pH 7), for 30
min at room temperature; Anti-DIG-alkaline phosphatase (Roche) diluted 1:5,000 in
blocking buffer, 30 min at room temperature; washing buffer, two washes each of 15
min at room temperature; detection buffer (100 mM Tris-HCl, 100 mM NaCl; pH 9.5),
2 min at room temperature. The membrane is then removed, placed on top of an
acetate sheet, and ~0.5 ml (per 100 cm²) of CSPD or CDP-star added to the top of the
membrane. A second sheet of acetate is then placed over the surface of the membrane,
the assembly incubated for 5 min at room temperature and then sealed in a plastic bag.
The assembly is then exposed to X-ray film for between 15 min and 1 hour. Optimal
exposure time is determined empirically by increasing exposure time up to 24 hours.

The presence of a band on the gel is evidence of a gene in the genomic DNA of 
interest. The molecular weight of the band depends on the size of the restriction 
fragment that contains the gene.

Example 5. Expression during infection of wax moth larvae (Galleria mellonella) and
mice with A. fumigatus

5.1 Preparation of cDNA from infected wax-moth larvae

Wax moth larvae have been shown to be good model systems in which to study
Candida infection (Cotter et al., 2000, FEMS Immunol Med Microbiol 27, 163-9;
Brennan et al., 2002, FEMS Immunol Med Microbiol 34, 153-7). We have found that
this insect system is also a good system in which to study Aspergillus infection (D.
Law and J. Rooke, manuscript in preparation).

5.1.1 Growth and infection of wax-moth larvae

Spores of A. fumigatus (AF293), grown on Sabouraud Dextrose agar, were harvested
and re-suspended in PBS/Tween 80. Spores were washed and the concentration
adjusted such that a 10 µl inoculum would cause death in 90% of the test group 3-4 days
after infection (for AF293 this is 5.0-7.0x10^ cfu/ml). Inoculum concentration was
estimated using an improved Neubauer haemocytometer counting chamber and
confirmed by TVC enumeration.

Wax moth larvae were purchased from Livefood UK, Somerset, UK
(www.livefood.co.uk), and were maintained in the dark at room temperature in wood
shavings prior to infection. Healthy larvae (250 mg +/- 50 mg) were selected and
incubated at 4°C for 10 minutes immediately prior to infection to immobilise them.
Larvae were then injected through the cuticle of the left last pro-leg with 10 µl spore
suspension (100x stock), using a sterile Hamilton syringe. Larvae were then
transferred to a sterile Petri dish. The following controls were also established: larvae
injected with 10 µl PBS/Tween only; larvae injected with 10 µl heat-killed spores
(killed by incubation for 20 min at 100°C); larvae pierced but not injected; and untouched
larvae. Larvae were incubated at 30°C and monitored at least twice daily. All
treatments and controls were carried out on batches of 10 larvae. Larval deaths and
general health condition were recorded every 24 hrs and dead or moribund larvae were
removed from the test group.

5.1.2 Preparation of DNA-free RNA from Aspergillus fumigatus-infected wax moth
larvae (Galleria mellonella)

cDNA was prepared from the following sources: uninfected larvae; larvae after 48h
infection with A. fumigatus (early infection); larvae after 72h infection with A.
fumigatus (late infection); larvae infected with heat-killed A. fumigatus spores; and A.
fumigatus grown in Sabouraud Dextrose agar broth for 16hr.

Frozen larvae were ground to a fine powder under liquid nitrogen in a mortar and
pestle previously baked at 22°C overnight, treated with RNaseZAP, rinsed with DEPC-
treated water (0.1% (v/v) DEPC, stirred for 1h and autoclaved for 1h) and cooled with
liquid nitrogen. Ground sample was transferred to Eppendorf tubes (no more than 50
mg per tube) and total RNA extracted using the Qiagen RNeasy Plant Mini Kit
following the protocol for isolation of total RNA from filamentous fungi in the
RNeasy Mini Handbook (06/2001, pages 75-78,
http://www.qiagen.com/literature/handbooks/rna/rnamini/1016272HBRNY_062001WW.pdf).

The following modifications were used: At step 3, 600 µl RLT was added to
each 50 mg tissue and vortexed; At step 4, samples were centrifuged for 3 min at
maximum speed; At step 6, all samples from the same tissues were applied to the same
RNeasy column; At step 7, RNeasy column was incubated for 5 min at room
temperature after addition of RW1; Optional step 9a was carried out twice; At step 10,
30 µl RNase-free water was added, samples incubated for 10 min at room temperature,
and then centrifuged for 1 min at 14,000 RPM; At step 11, the elution step was
repeated to give a total volume of 60 µl RNA. A sample of the RNA was run on a
1.5% agarose gel and the amount of RNA quantified using the molecular marker. RNA
was then stored at -80°C.

A portion of the RNA was DNase-treated using 2 µl RNase-free DNase
(Promega) per µg RNA, in the presence of 10X DNase buffer (Promega) at 37°C for
4h. The RNA was then cleaned up using the Qiagen RNeasy Plant Mini Kit following
the RNeasy Mini Protocol for RNA Cleanup (RNeasy Mini Handbook 06/2001, pages
79-81), but including a further DNase treatment step during clean-up as in the RNeasy
handbook.

The following modifications were made: Optional step 5a was carried out; At
step 6, 30 µl RNase-free water was added, samples incubated for 10 min at room
temperature and then centrifuged for 1 min at 14,000 RPM; At step 7, the eluate from
step 6 was transferred onto the RNeasy column, incubated for 10 min at room
temperature, and then centrifuged for 1 min at 14,000 RPM. A sample of the DNase-
treated RNA was run on an agarose gel, quantified and stored at -80°C.

5.1.3 Checking RNA samples for DNA contamination

To verify the absence of genomic DNA from the RNA samples, PCR was carried out
using primers that amplify the β-tubulin gene (SEQ ID Nos. 77 and 78). In the absence
of a reverse-transcription step, only gDNA will be detected and thus any gDNA
contamination will be revealed. The following reaction mixture was set up:

12.5 µl 2x ReddyMix PCR mastermix (ABgene)
1 µl each primer (5 pmol)
template gDNA (1.5-4 µg/ml)
nuclease-free water to give a final volume of 25 µl

The reactions were run using the following conditions on a Biometra personal PCR
cycler (Thistle Scientific Ltd, DFDS House, Goldie Road, Uddington, Glasgow, G71
6NZ):



Step1    95°C    5 min
Step2    90°C    1 min
Step3    51°C    1 min
Step4    68°C    1 min
Step5    68°C    10 min
Step6    4°C     Hold

40 cycles of steps 2-4 were carried out.

If a PCR product was observed, genomic DNA was present and the sample was
DNase-treated again. If the PCR was negative, no DNA was present in the sample.

5.1.4 Preparation of cDNA

300 µg DNA-free RNA and 3 µl oligo (dT) (100 ng/µl) were added to an RNase-free
0.5 ml microcentrifuge tube, and made up to a total volume of [...] with [...]
water. Samples were mixed and incubated in a heat block at SS^'C for 5 min and then
slowly cooled to room temperature. 2 µl Ultrapure dNTPs (10 mM each, Clontech), 1
µl Stratascript reverse transcriptase (Stratagene) and 5 µl 10X reverse transcriptase
reaction buffer were then added. The samples were incubated at 42°C for 1h, denatured
at 90°C for 5 min and then cooled on ice. Samples were dispensed in 5-10 µl aliquots
and stored at -20°C.

5.2 Preparation of cDNA from infected mice

5.2.1 Infection of mice with A. fumigatus and extraction of tissues

Mice were infected with Aspergillus fumigatus and organs harvested as follows.
Thirteen male CD1 mice were injected with the immunosuppressant
cyclophosphamide (0.025 g/ml; 200 mg/kg) IV via the tail vein. After 72 hours, twelve
mice were injected with 0.15 ml Aspergillus fumigatus AF293 conidia (7.5 x 10^/ml).
11 hours after infection, four mice were sacrificed with an overdose of inhaled
halothane. The brain, lungs, liver and kidney were removed, frozen by immersion in
liquid nitrogen, and stored at -70°C. A further four mice were sacrificed at each of 24
and 48 hours after infection.

RNA was prepared from mouse tissues as described for wax moth larvae above
(5.1.2 and 5.1.3).

5.2.2 Preparation of cDNA from DNA-free RNA

cDNA was prepared from DNA-free RNA using the Promega Reverse Transcription
kit, following the protocol as supplied with the product (Technical Bulletin No. 099,
http://www.promega.com/tbs/tb099/tb099.pdf). In a modification to the protocol, the
cDNA synthesis reaction was incubated for 60 min at 42°C rather than for the
suggested 15 min. Samples were stored in 5-10 µl aliquots at -20°C.

5.3 Design and optimisation of primers

Primers were designed against the 2031 OR cDNA sequence using Beacon Designer
2.1 (Premier Biosoft, http://www.premierbiosoft.com) with the following parameters:
Target Tm = 58 ± 8°C; Length of primers = 16-24; Amplicon length = 75-150 bp. All
other settings were default. Care was taken to choose primers that would not form
dimers or other secondary structures. Secondary structures of amplicons were
calculated using mfold
(http://www.bioinfo.rpi.edu/applications/mfold/old/dna/form1.cgi) and primer sets
giving an amplicon with little or no secondary structure were chosen. The resulting
primers are given as SEQ ID Nos. 79 and 80.

To determine the optimum annealing temperature for the primer set, a gradient PCR was
run on an Icycler PCR machine (Biorad), using A. fumigatus AF293 genomic DNA as
a template and the following reaction mixture:

112.5 µl Abgene PCR Reddymix
9 µl SEQ ID No. 79; OXRED 2031F6 (5 pm/µl)
9 µl SEQ ID No. 80; OXRED 2031R5 (5 pm/µl)
85.5 µl H2O
9 µl AF293 gDNA (10 ng/µl)

For the negative control, the gDNA was omitted and the amount of water increased
correspondingly.
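
The bulk volumes in this mixture appear to correspond to nine 25 µl reactions (eight gradient wells plus a small allowance). As a minimal sketch of that scaling, the Python below multiplies assumed per-reaction volumes by the number of reactions. The per-reaction breakdown is inferred by dividing the listed volumes by nine and is not stated in the text.

```python
# Per-reaction volumes (µl) for one 25 µl reaction. This breakdown is an
# assumption obtained by dividing the listed bulk volumes by 9.
PER_REACTION_UL = {
    "2x PCR mix": 12.5,
    "forward primer": 1.0,
    "reverse primer": 1.0,
    "water": 9.5,
    "template gDNA": 1.0,
}


def scale_master_mix(per_reaction_ul, n_reactions):
    """Multiply each per-reaction volume by the number of reactions."""
    return {name: vol * n_reactions for name, vol in per_reaction_ul.items()}


mix = scale_master_mix(PER_REACTION_UL, 9)
total_ul = sum(mix.values())
print(mix)
print(total_ul)
```

For the negative control, the template volume would be moved into the water entry so that the total is unchanged.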



For each mix, 25 µl was pipetted into 8 wells on a multiwell plate, and each well run at
a different temperature (between 50 and 65°C) with the following conditions:

Step1. 95°C - 5 min
Step2. 95°C - 1 min
Step3. Gradient 50-65°C - 1.5 min
Step4. 72°C - 1 min
Step5. 72°C - 10 min
Step6. 8°C - hold

Steps 2-4 were run for 30 cycles.

The PCR products were run on a 2% agarose gel. A single band of the correct size of
148 bp was seen on the gel for all the temperatures, and the optimum was found to be

.•5-., ...... 

5.4 Testing species-specificity of primers

The real-time primers designed above were further tested to ensure that mouse nucleic
acid was not amplified using these primers. Four reactions were set up, each
containing the following:

12.5 µl Abgene Reddymix
1 µl primer SEQ ID No. 79
1 µl primer SEQ ID No. 80
9.5 µl H2O

and either: 1 µl infected mouse kidney cDNA (50 ng/µl; experimental); 1 µl uninfected
mouse kidney cDNA (50 ng/µl; uninfected control); 1 µl AF293 gDNA (10 ng/µl;
positive control); or 1 µl water (negative control).

The following PCR settings were used:

Step1. 95°C - 5 min
Step2. 95°C - 1 min
Step3. 63°C - 1.5 min
Step4. 72°C - 1 min
Step5. 72°C - 10 min
Step6. 8°C - hold

Steps 2-4 were run 40 times.

The PCR products were run on a 2% agarose gel. A. fumigatus genomic DNA gave a
band of 148 bp, the expected size, but no bands were seen in uninfected or infected
mouse cDNA. These primers therefore appeared to be specific.

5.5 Real-time PCR to detect expression in infected larvae

PCR reactions were set up using the Biorad iQ SYBR green supermix as follows:

14 µl Primer SEQ ID No. 79
14 µl Primer SEQ ID No. 80
175 µl SYBR mix
133 µl H2O

Four reactions were set up containing 72 µl of the above mix and either 3 µl H2O; 3 µl
uninfected larvae cDNA (50 ng/µl); 3 µl AF293 gDNA (5 ng/µl); or 3 µl infected
larvae cDNA (50 ng/µl). 3 x 25 µl aliquots of each reaction were dispensed
into an Abgene multiwell plate, the plate sealed with optical sealing tape (Biorad), then
placed in a Biorad Icycler real-time PCR machine. Reactions were run with the
following conditions:

Step1. 95.0°C 3 min
Step2. 95.0°C 30 sec
Step3. 63.0°C 30 sec
Data collection and real-time analysis enabled.
Step4. 72.0°C 15 sec
60 cycles of steps 2-4.
Step5. 95.0°C 30 sec
Step6. 50.0°C 30 sec
Step7. 50.0°C 10 sec
90 cycles of step 7 with setpoint temperature increased by 0.5°C after each cycle
starting with cycle 2. Melt curve data collection and analysis enabled.
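
The melt-curve ramp in step 7 can be written out explicitly. The sketch below lists the setpoint of each repeat, assuming that cycle 1 runs at 50.0°C and that each later cycle is 0.5°C higher than the one before; this reading of "starting with cycle 2" is an interpretation, and the function name is illustrative.

```python
def melt_curve_setpoints(start_c=50.0, step_c=0.5, cycles=90):
    """Setpoint (°C) for each repeat of step 7.

    Cycle 1 runs at the start temperature; the increment is applied
    from cycle 2 onwards.
    """
    return [start_c + step_c * i for i in range(cycles)]


setpoints = melt_curve_setpoints()
print(setpoints[0], setpoints[1], setpoints[-1])
```

Under this reading the ramp covers 50.0°C to 94.5°C, which spans the melt temperatures reported for both the specific product and the primer dimers.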

Results are shown in Tables IV and V. Expression of 2031 OR was demonstrated in
both AF293 cDNA (Ct = 25.8) and in infected larvae (Ct = 32.3). Therefore, the
message is expressed both in A. fumigatus cultures and in A. fumigatus from infected
larvae. The negative and uninfected larvae controls give only primer dimers and non-
specific products.



Table IV. PCR Quantification Spreadsheet Data for SYBR-490

Well    Identifier                  Ct
C08     infected larvae (50ng)
C09     infected larvae (50ng)      32.4
C10     infected larvae (50ng)
D03     Negative                    51.3
D04     Negative                    N/A
D05     Negative                    55.6
H03     uninfected larvae           36.4
H04     uninfected larvae           N/A
H05     uninfected larvae           N/A
H08     A. fumigatus gDNA (5ng)     25.8
H09     A. fumigatus gDNA (5ng)     26
H10     A. fumigatus gDNA (5ng)     25.8

Data Analysis Parameters: Calculated threshold was replaced by the user selected
threshold 7.4; User selected baseline cycles were 2 to 10.

Table V. Melt Curve Analysis Spreadsheet Data for SYBR-490

Well    Well Identifier             Peak ID    Melt Temp
C8      infected larvae (50ng)      C8.1       88.5
C9      infected larvae (50ng)      C9.1       88.5
C10     infected larvae (50ng)      C10.1      88.5
D3      Negative                    D3.1       78
D5      Negative                    D5.1       81.5
                                    D5.2       77.5
H3      uninfected larvae           H3.1       81.0
H5      uninfected larvae           H5.1       78.0
H8      A. fumigatus gDNA (5ng)     H8.1       89.0
H9      A. fumigatus gDNA (5ng)     H9.1       89.0
H10     A. fumigatus gDNA (5ng)     H10.1      89.0



Melt Curve Analysis Parameters; Threshold for automatic peak detection was set at 
2.64. 
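
The distinction drawn from the melt-curve data, between the specific product and primer dimers or non-specific products, can be expressed as a simple tolerance check on the melt temperature. The sketch below applies it to the Table V values. The expected melt temperature of 88.5°C is taken from the text; the ±1.0°C tolerance and the function name are illustrative assumptions.

```python
# (well, sample, melt temperature in °C) as listed in Table V.
TABLE_V = [
    ("C8", "infected larvae", 88.5),
    ("C9", "infected larvae", 88.5),
    ("C10", "infected larvae", 88.5),
    ("D3", "negative", 78.0),
    ("D5", "negative", 81.5),
    ("D5", "negative", 77.5),
    ("H3", "uninfected larvae", 81.0),
    ("H5", "uninfected larvae", 78.0),
    ("H8", "A. fumigatus gDNA", 89.0),
    ("H9", "A. fumigatus gDNA", 89.0),
    ("H10", "A. fumigatus gDNA", 89.0),
]


def is_specific(melt_temp_c, expected_c=88.5, tolerance_c=1.0):
    """True if a melt peak lies within the tolerance of the expected Tm."""
    return abs(melt_temp_c - expected_c) <= tolerance_c


# Samples with at least one peak matching the expected product.
specific_samples = sorted({s for _, s, t in TABLE_V if is_specific(t)})
print(specific_samples)
```

Only the infected larvae and the A. fumigatus gDNA control give peaks within the tolerance; the negative and uninfected controls do not.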



5.6 Real-time PCR to detect expression in infected mouse kidney cDNA

Real-time experiments similar to those described in 5.5 using 1 µl of infected mouse
cDNA showed no amplification (data not shown). The experiment was therefore
carried out using an increased amount of infected mouse cDNA with the following
conditions:

18 µl Primer SEQ ID No. 79
18 µl Primer SEQ ID No. 80
225 µl SYBR mix
99 µl H2O

Four reactions were set up containing 60 µl of the above mix and either 15 µl H2O; 3
µl uninfected mouse kidney (50 ng/µl) + 12 µl H2O; 15 µl infected mouse kidney,
48h post-infection (50 ng/µl); or 3 µl AF293 cDNA (5 ng/µl) + 12 µl H2O.
3 x 25 µl aliquots of each reaction were dispensed into an Abgene multiwell plate, the
plate sealed with optical sealing tape (Biorad), then placed in a Biorad Icycler real-
time PCR machine. Reactions were run with the following conditions:

Step1. 95.0°C for 3 min
Step2. 95.0°C for 30 sec
Step3. 63.0°C for 30 sec
Data collection and real-time analysis enabled.
Step4. 72.0°C for 15 sec
60 cycles of steps 2-4.
Step5. 95.0°C for 30 sec
Step6. 50.0°C for 30 sec
Step7. 50.0°C for 10 sec
90 cycles of step 7 with setpoint temperature increased by 0.5°C after each cycle
starting with cycle 2. Melt curve data collection and analysis enabled.

Expression of A. fumigatus AF293 2031 OR was seen in cDNA (Ct = 28.8) but only in
2 of the 3 infected mouse kidney reactions (Ct values = 34.4, 41.2) (Tables VI and
VII). The product in the other infected kidney cDNA reaction (well A12) was a primer
dimer or a non-specific product (Tm = 81°C on the melt curve), whereas the correct
2031 OR product has a Tm of 88.5°C (Tables VI and VII). The negative and
uninfected kidney controls gave only primer dimers and non-specific products.



Table VI. PCR Quantification Data for SYBR-490

Well    Identifier                  Ct
A10     infected kidney (250ng)     34.4
A11     infected kidney (250ng)     41.2
A12     infected kidney (250ng)     38
D02     Negative                    50.3
D03     Negative                    54.6
D04     Negative                    46.2
H02     uninfected kidney           52.8
H03     uninfected kidney           54
H04     uninfected kidney           51.8
H10     AF293 (5ng)                 28.7
H11     AF293 (5ng)                 28.7
H12     AF293 (5ng)                 30

Calculated threshold was replaced by the user selected threshold 5.4. User selected
baseline cycles were 2 to 10.



Table VII. Melt Curve Analysis Spreadsheet Data for SYBR-490

Well    Well Identifier             Peak ID    Melt Temp
A10     infected kidney (250 ng)    A10.1      88.5
A11     infected kidney (250 ng)    A11.1      88.5
A12     infected kidney (250 ng)    A12.1      81.0
D2      Negative                    D2.1       79.0
D3      Negative                    D3.1       78.0
D4      Negative                    D4.1       78.0
H2      uninfected kidney           H2.1       78.5
H3      uninfected kidney           H3.1       77.5
H4      uninfected kidney           H4.1       90.5
H10     AF293 (5ng)                 H10.1      88.5
H11     AF293 (5ng)                 H11.1      88.5
H12     AF293 (5ng)                 H12.1      88.5

Threshold for automatic peak detection was set at 2.09. 



A. fumigatus 2031 OR is therefore clearly expressed during infection of wax moth
larvae. 2031 OR is only expressed at a very low level during infection of mouse
kidney, since increased amounts of template had to be used to give a signal. The
expression during infection suggests that the gene product may be a suitable target for
an anti-fungal drug.

Example 6. Expression of recombinant 2031 OR and/or fragments

Recombinant proteins or fragments were expressed to enable detailed study of function
and as the starting point for the development of a high-throughput screen for inhibitory
compounds.

6.1 Production of cDNA constructs

PCR was carried out using cDNA prepared as described above to generate
polynucleotides encoding 2031 OR sequence essentially corresponding to SEQ ID No.
3. PCR reactions were carried out using the following reaction mixture and conditions.
All reagents were present in the KOD kit (Novagen).

2.5 µl 10x PCR Buffer
5 µl dNTPs (2 mM)
2 µl MgSO4 (25 mM)
1 µl primer A (5 pmol) (SEQ ID No. 55; SL_OxXa30F5)
1 µl primer B (5 pmol) (SEQ ID No. 56; SL-OxXa30R7)
1 µl template cDNA
11.5 µl nuclease-free water
1 µl KOD Polymerase

PCR reactions were run using the following conditions:

Step1    94°C      5 min
Step2    94°C      1 min
Step3    59.3°C    1 min
Step4    68°C      1 min 30 sec
Step5    68°C      10 min
Step6    10°C      Hold

40 cycles of steps 2-4 were carried out and the PCR products were purified using Qiagen's
QIAquick PCR Purification Kit (Qiagen Ltd, Boundary Court, Gatwick Road, Crawley,
West Sussex, RH10 9AX, UK) according to the manufacturer's instructions. The purified
PCR products were examined on agarose gels.

cDNA fragments were then cloned into the pET30 Xa/LIC vector (Novagen),
transformed into NovaBlue chemically competent E. coli cells, and plated onto a
prewarmed kanamycin (+) selection plate. After an overnight incubation at 37°C,
kanamycin-resistant colonies were selected and grown up in kanamycin-containing LB
medium. Plasmid DNA was isolated using the Plasmid Mini Kit (Qiagen). The presence
and correct orientation of the inserts were confirmed by restriction analysis
and sequencing of the construct.

Purified plasmid DNA, which had been confirmed to be of the correct sequence and
orientation, was transformed into chemically competent BL21 Star (DE3) One Shot E. coli
cells and grown overnight at 37°C. 2 ml of an overnight culture were used to inoculate
100 ml of LB, 30 µg/ml kanamycin, and the cultures incubated at 37°C, 220 rpm until the
cell density reached an optical density of 0.5 (approximately 3 hours). Expression of the
recombinant protein was then induced with IPTG (1 mM) for 5 hours.

Bacteria were harvested by centrifugation at 4500 rpm for 10 minutes and the
pellets lysed in lysis buffer (10 ml Bugbuster (Novagen), 10 µl Benzonase (Novagen),
0.4 µl lysozyme (Novagen) and 100 µl 1M imidazole) for 20 minutes at room
temperature. Cells were then spun down at 16000g for 20 min at 4°C and the supernatant,
containing soluble recombinant protein, removed to a clean tube.

Supernatant was added to prewashed Ni-NTA resin at a concentration of 5-10 mg
protein per ml of resin and allowed to bind for 1 hour at 4°C. Protein-resin mix was
then poured into a column, washed twice in 4 ml of wash buffer (2.5 ml 1M phosphate
buffer pH8, 6.25 ml 4M NaCl, 1 ml 1M imidazole pH8, 0.5 ml 10% Tween 20; made
up to 50 ml in n.H2O) and then eluted in 4 x 0.5 ml fractions with elution buffer (250
µl 1M phosphate buffer pH8, 625 µl 4M NaCl, 1.25 ml 1M imidazole pH8, 50 µl 10%
Tween 20; made up to 5 ml in n.H2O). Fractions containing purified protein were
detected by SDS-PAGE and Western blotting using an S-tag HRP conjugate (Novagen).
Fractions containing purified recombinant protein were concentrated using YM10
columns (Millipore).
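
The wash and elution buffer recipes are given as stock volumes, so the final concentrations follow by simple dilution. The sketch below works these out; the helper name is illustrative, and the Tween 20 volume in the elution buffer is assumed to be 50 µl (0.05 ml), since the unit is not legible in the recipe.

```python
def final_concentration(stock_conc, stock_volume_ml, total_volume_ml):
    """Final concentration after dilution, in the same unit as the stock."""
    return stock_conc * stock_volume_ml / total_volume_ml


# (stock concentration, stock volume in ml); molar units except Tween (%).
wash = {"phosphate_M": (1.0, 2.5), "NaCl_M": (4.0, 6.25),
        "imidazole_M": (1.0, 1.0), "tween_pct": (10.0, 0.5)}
# The 0.05 ml Tween volume is an assumption (unit missing in the recipe).
elution = {"phosphate_M": (1.0, 0.25), "NaCl_M": (4.0, 0.625),
           "imidazole_M": (1.0, 1.25), "tween_pct": (10.0, 0.05)}

wash_final = {k: final_concentration(c, v, 50.0) for k, (c, v) in wash.items()}
elution_final = {k: final_concentration(c, v, 5.0) for k, (c, v) in elution.items()}
print(wash_final)
print(elution_final)
```

On these figures the two buffers differ only in imidazole (20 mM in the wash, 250 mM in the elution), which is consistent with imidazole being the eluting agent.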

Figure 3A shows the induction of recombinant 2031 OR expression by IPTG
over 24 hours. Protein samples were taken at time points, run on an SDS-PAGE gel
and stained with coomassie. By 1 hr a band of the correct size was clearly induced
compared to the uninduced samples. The amount of protein increased with longer
induction times. Figure 3B shows a coomassie stained gel of the purified recombinant
2031 OR. Alternative expression systems can be used for expression in bacteria, such
as the glutathione S-transferase or mannose-binding fusion-protein system.

Recombinant fragments of other 2031 ORs can be generated using the primer
pairs and templates described in Table VIII, or similar primers and other 2031 OR
listed in Table III.



Table VIII. Primer pairs for the recombinant expression of 2031 OR family proteins

Species         Template          Primer A         Primer B
A. fumigatus    SEQ ID No. 2      SEQ ID No. 55    SEQ ID No. 56
A. fumigatus    SEQ ID No. 5      SEQ ID No. 57    SEQ ID No. 58
A. fumigatus    SEQ ID No. 7      SEQ ID No. 59    SEQ ID No. 60
A. nidulans     SEQ ID No. 9      SEQ ID No. 61    SEQ ID No. 62
C. albicans     SEQ ID No. 11     SEQ ID No. 63    SEQ ID No. 64
M. grisea       SEQ ID No. 21     SEQ ID No. 65    SEQ ID No. 66



Example 7. Oxidoreductase assay and inhibitor screening

7.1 Oxidoreductase assay

The assay for 2031 OR is based on methods described by Abramovitz & Massey
(1976, J. Biol. Chem. 251: 5321-5326) and Stott et al. (1993, J. Biol. Chem. 268:
6097-6106) and is based upon the ability of this enzyme to oxidise the pyridine
nucleotides NADH and/or NADPH. The peak of absorbance for the reduced form of
these cofactors (i.e. NADH and NADPH) is at a wavelength of 340 nm whereas the
oxidised forms of the cofactors (i.e. NAD+ and NADP+) do not absorb at this
wavelength. Conversion of NAD(P)H to NAD(P)+ can therefore be monitored
spectrophotometrically at a wavelength of 340 nm. A similar assay can be employed
for all oxidoreductases that use NADH or NADPH as a cofactor.

Assays were carried out in 96-well plates. To each well was added the following:
Recombinant 2031 OR (10-1000 ng); 40 µl of 125-2500 µM NADPH; 1 µL 100 mM
cyclohexenone or other substrate, and the volume made up to 200 µL with 0.1 M
potassium phosphate pH 7.0. Samples were incubated at room temperature and
absorbance measurements were taken at 340 nm every 30 seconds for 10 min. The
change in absorbance was expressed as nmoles NADPH oxidised, using the molar
extinction coefficient of NADPH and NADH at 340 nm of 6270 (i.e., a 1M solution has
an optical density of 6270 at this wavelength).
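
The conversion from a change in absorbance to nmoles of NADPH oxidised follows from the Beer-Lambert law, A = ε·c·l. The sketch below is one way to carry it out for a 200 µL well. The extinction coefficient is the value given in the text, taken here to be per M per cm; the 1 cm default path length and the function name are assumptions, and the true path length in a 96-well plate depends on the plate and fill volume.

```python
EPSILON_340 = 6270.0  # per M per cm; value as given in the text


def nmol_nadph_oxidised(delta_a340, volume_ul=200.0, path_cm=1.0,
                        epsilon=EPSILON_340):
    """Convert a drop in A340 into nmol NAD(P)H oxidised (Beer-Lambert).

    path_cm defaults to 1.0 as a placeholder; substitute the measured
    path length of the plate actually used.
    """
    delta_conc_molar = delta_a340 / (epsilon * path_cm)
    volume_l = volume_ul * 1e-6
    return delta_conc_molar * volume_l * 1e9


example_nmol = nmol_nadph_oxidised(0.1)
print(round(example_nmol, 2))
```

Because the result scales inversely with path length, a shorter effective path in the well gives a proportionally larger amount of NADPH for the same absorbance change.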

Initial experiments with a variety of potential substrates for recombinant 2031
OR showed that the protein had a functional dehydrogenase activity and determined
that cyclohexenone was a better substrate than menadione, duroquinone or
N-ethylmaleimide. This is illustrated in figure 5. Final concentrations in the assay were as
follows: 500 µM substrate, 1 µg/200 µL 2031 OR, 120 µM NADPH.

Although the physiological substrates of 2031 OR remain to be determined,
generic oxidoreductase substrates such as ferricyanide, methylene blue, phenazine
methosulphate and 2,6-dichlorophenolindophenol may also be used to assay for
oxidoreductase activity.

Screens for inhibitors of 2031 OR can be carried out using the assay described
above modified by the addition of putative inhibitor substances to the reactions and
decreasing the amount of potassium phosphate buffer. Assays can be carried out in
384- or 1536-well plates to increase throughput of the screen.

7.2 High-throughput screen for the identification of 2031 OR inhibitors

2031 OR inhibitors were identified by means of a high-throughput screen. The
following reagents were prepared:

Assay plates: Compounds to be tested were dissolved in 100% DMSO (polypropylene
vessels), diluted in water and loaded into 384 square well polystyrene plates
(10 µl/well). The final DMSO concentration in all assay wells was 5% v/v.

β-NADPH (tetrasodium salt)/2-cyclohexen-1-one reagent: Solutions of NADPH
(1.2917 mM in 100 mM potassium phosphate buffer, pH7.0) and 2-cyclohexen-1-one
(10 mM in 100 mM potassium phosphate buffer, pH7.0) were prepared on the day of
the assay and combined in a ratio of 1 part of 2-cyclohexen-1-one solution to 9 parts
NADPH solution. Final assay well concentrations for NADPH and 2-cyclohexen-1-
one were 465 µM and 400 µM respectively.
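
The stated final well concentrations can be checked from the stock concentrations, the 9:1 mixing ratio and the dispensed volumes. The sketch below assumes a 50 µl reaction volume before the stop reagent (10 µl compound, 20 µl enzyme and 20 µl combined reagent, as given for the screen in this section); the function and parameter names are illustrative.

```python
def final_well_conc_um(stock_mm, part, total_parts, reagent_ul, well_ul):
    """Final concentration (µM) of a component after mixing and dispensing.

    stock_mm    : concentration of the individual solution (mM)
    part        : parts of this solution in the combined reagent
    total_parts : total parts in the combined reagent
    reagent_ul  : volume of combined reagent added per well
    well_ul     : reaction volume per well before the stop reagent
    """
    in_reagent_mm = stock_mm * part / total_parts
    return in_reagent_mm * reagent_ul / well_ul * 1000.0


# 10 µl compound + 20 µl enzyme + 20 µl reagent = 50 µl per well.
nadph_um = final_well_conc_um(1.2917, 9, 10, 20.0, 50.0)
substrate_um = final_well_conc_um(10.0, 1, 10, 20.0, 50.0)
print(round(nadph_um), round(substrate_um))
```

Under these assumptions the calculation reproduces the stated 465 µM NADPH and 400 µM 2-cyclohexen-1-one.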

2031 OR enzyme: Recombinant enzyme was prepared as described in Example 6 and
desalted as follows: 2.5 ml of eluted protein was loaded onto a PD10 column
(Amersham) equilibrated with 25 ml of 0.1 M KPO4 pH7. The protein was then eluted
with 3.5 ml of 0.1 M KPO4 pH7. Aliquots of the protein were stored at -80°C. For the
screen, protein was typically diluted to 5 to 11.25 µg/ml in 100 mM potassium
phosphate buffer, pH7.0.

Stop reagent: 0.4 M NaOH in water. 

The Km for 2-cyclohexen-1-one, the substrate for 2031 OR in the screening assay, was
determined to be 100 µM. To give an increased signal, the screen was carried out using
2-cyclohexen-1-one at 4 times Km. The kinetics of the screen over the prescribed
incubation time were such that reaction progress curves were linear with both time and
protein concentration. The Z' value for the screen was equal to 0.77 and thus fully
acceptable (Zhang et al., 1999, J. Biomolecular Screening, 4, 67-73). Consistency of
signal between wells on plates, plate to plate and screen run to screen run were also
acceptable for an HTS regime.
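
The Z' statistic of Zhang et al. is computed from the means and standard deviations of the two control populations: Z' = 1 - 3(σ1 + σ2)/|μ1 - μ2|. The sketch below is a minimal implementation; the control readings are hypothetical values chosen for illustration and are not data from the screen.

```python
from statistics import mean, stdev


def z_prime(controls_a, controls_b):
    """Z' = 1 - 3*(sd_a + sd_b) / |mean_a - mean_b| for two control sets."""
    separation = abs(mean(controls_a) - mean(controls_b))
    return 1.0 - 3.0 * (stdev(controls_a) + stdev(controls_b)) / separation


# Hypothetical A340 readings, not data from the screen.
no_enzyme = [1.00, 1.02, 0.98, 1.01, 0.99]    # no NADPH consumed
no_compound = [0.40, 0.42, 0.38, 0.41, 0.39]  # uninhibited reaction
z = z_prime(no_enzyme, no_compound)
print(round(z, 2))
```

Values between 0.5 and 1 are conventionally regarded as indicating a robust assay, which is the sense in which the 0.77 reported for the screen is described as acceptable.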

Assays were carried out using Tecan Freedom, Tecan TeMo and PerkinElmer
MiniTrak robots together with a ThermoLabsystems Multidrop 384 and a Tecan Safire
automated plate reader. 20 µl of enzyme followed by 20 µl NADPH/2-cyclohexen-1-one
solution were added to wells of the microtitre plates containing test compounds. 20
µl of 100 mM potassium phosphate buffer, pH 7.0, was used for a duplicate set of plates
for background no-enzyme controls; DMSO (diluted in the same way as solubilised
compound stocks) was used for no-compound controls. Plates were incubated at room
temperature for 30 minutes, after which 25 µl of 0.4 M NaOH stop reagent was added.
Plates were read at 340 nm on a Tecan Safire plate reader and data processed using
in-house Excel spreadsheets to convert raw data into percent inhibition data.
Secondary screens were carried out to measure dose response data for selected
compounds, using essentially the same protocol as the primary screen. The secondary
screen used the Excelfit version 3 software (IDBS), with sigmoidal model 606, to
graph appropriate inhibition values and determine IC50 data for compounds tested.
Figure 6 shows typical results for 2 inhibitory compounds (A and B) identified by the
primary screen and then assayed in the secondary screen.
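
The conversion of raw readings into percent inhibition can be sketched as follows. This is not the in-house spreadsheet, and the IC50 estimate shown is a simple log-linear interpolation substituted for the Excelfit sigmoidal model 606, which is not reproduced here. It is assumed that the no-enzyme wells define 100% inhibition (NADPH not consumed, high A340) and the DMSO no-compound wells define 0% inhibition; all names and values are hypothetical.

```python
import math


def percent_inhibition(a340_test, a340_no_compound, a340_no_enzyme):
    """Scale a test-well reading between the 0% and 100% inhibition controls."""
    window = a340_no_enzyme - a340_no_compound
    if window <= 0:
        raise ValueError("no-enzyme control must read higher than no-compound control")
    return 100.0 * (a340_test - a340_no_compound) / window


def ic50_by_interpolation(concentrations, inhibitions):
    """Interpolate on log10(concentration) between the doses bracketing 50%.

    Returns None when no adjacent pair of doses brackets 50% inhibition.
    """
    points = sorted(zip(concentrations, inhibitions))
    for (c_lo, i_lo), (c_hi, i_hi) in zip(points, points[1:]):
        if i_lo <= 50.0 <= i_hi and i_hi > i_lo:
            fraction = (50.0 - i_lo) / (i_hi - i_lo)
            log_ic50 = math.log10(c_lo) + fraction * (math.log10(c_hi) - math.log10(c_lo))
            return 10 ** log_ic50
    return None


if __name__ == "__main__":
    # Hypothetical readings: test well 0.70, DMSO control 0.40, no-enzyme control 1.00.
    print(percent_inhibition(0.70, 0.40, 1.00))
    print(ic50_by_interpolation([1, 100], [25, 75]))
```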

Identification of the correct stop reagent for the HTS assay was not trivial.
Initially, a chemical inhibitor of the system was sought to terminate the reactions in a
pH-independent manner, but it was found that NaOH offered more benefits than
originally anticipated, in that it not only overcame the buffering in the reaction to fully
terminate the reaction, but also afforded a much greater protection for unreacted
NADPH. It is known that high levels of NaOH convert NADP, a product of the
reaction which does not absorb at 340 nm, to a fluorescent product, which would
interfere with the 340 nm readings taken (Passonneau and Lowry, 1993, Enzymatic
analysis, a practical guide, pp. 3-21 and p. 381, The Humana Press Inc., NJ, USA).
Therefore, the NaOH level used in the HTS assay was chosen such that the amount of
fluorescence from NADP conversion was reduced to an insignificant level, whilst fully
terminating the reaction. The greater stability of the NADPH afforded by the use of
NaOH meant that instead of immediate plate readings, plates could be read up to at
least 20 hours post reaction termination (no further extended time points were
investigated). This was an obvious advantage in that larger screens could be run.
Plates stored for spectrophotometric reading were sealed with self-adhesive film and
stored in the dark.

Example 8. Method for detecting fungal infection 

The sequences described in the invention were exploited to diagnose fungal
infections. Samples from patients potentially carrying an infection with A. fumigatus,
A. nidulans or C. albicans, or rice leaves or stems potentially infected with M. grisea, or
alfalfa infected with C. trifolii, or wheat infected with F. graminearum, F.
sporotrichioides or M. graminicola, or other organisms, are processed to extract
DNA using the DNeasy Tissue kit or QIAamp DNA Blood Mini kit (Qiagen,
Crawley, UK), although other DNA preparation methods are available and suitable.
Once DNA has been prepared, PCR reactions are set up as follows:

Reaction mix:

12.5 µl 2x ReddyMix PCR mastermix (ABgene)
1 µl primer A (5 pmol)
1 µl primer B (5 pmol)
5 µl template DNA
5.5 µl nuclease-free water
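
Where several samples are tested with the same primer pair, the components other than template may conveniently be pooled. The following sketch scales the volumes listed above; the 10% pipetting overage and the function name are assumptions made for illustration and are not part of the method described.

```python
# Per-reaction volumes (µl) from the reaction mix above, excluding template DNA.
REACTION_MIX_UL = {
    "2x ReddyMix PCR mastermix": 12.5,
    "primer A (5 pmol)": 1.0,
    "primer B (5 pmol)": 1.0,
    "nuclease-free water": 5.5,
}
TEMPLATE_UL = 5.0  # added to each tube individually

# Total volume of one reaction, template included.
TOTAL_REACTION_UL = sum(REACTION_MIX_UL.values()) + TEMPLATE_UL


def master_mix(n_reactions, overage=0.10):
    """Pooled volumes (µl) for n reactions; the overage fraction is an assumption."""
    scale = n_reactions * (1.0 + overage)
    return {name: round(volume * scale, 2) for name, volume in REACTION_MIX_UL.items()}


if __name__ == "__main__":
    print(f"Total per reaction: {TOTAL_REACTION_UL} µl")
    for name, volume in master_mix(8).items():
        print(f"{volume:7.2f} µl  {name}")
```

Each tube then receives the pooled mix less the template volume (20 µl here), followed by 5 µl of template DNA.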
Suitable primer pairs are given in Table IX below:

Table IX. Primer pairs for PCRs to diagnose fungal infection.

Species         Template         Primer A                Primer B
A. fumigatus    SEQ ID No. 1     SEQ ID No. 67 (94)      SEQ ID No. 68 (286)
A. fumigatus    SEQ ID No. 4     SEQ ID No. 69 (239)     SEQ ID No. 70 (450)
A. fumigatus    SEQ ID No. 7     SEQ ID No. 71 (1097)    SEQ ID No. 72 (1271)
C. albicans     SEQ ID No. 11    SEQ ID No. 73 (103)     SEQ ID No. 74 (277)
M. grisea       SEQ ID No. 20    SEQ ID No. 75 (385)     SEQ ID No. 76 (620)

Figures in brackets after SEQ ID No. indicate the base in the template at which the
primer starts.
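
An approximate product size for each primer pair can be derived from the bracketed start positions. The sketch below assumes that both figures are template coordinates of the primers' 5' ends, with primer B annealing to the opposite strand, so that the product spans both positions inclusive. If the primer B figure instead marks the lowest template coordinate of its binding site, the true product is longer by the primer length less one base. The sizes are therefore indicative only.

```python
# (species, template SEQ ID No., primer A start, primer B start), from Table IX.
PRIMER_PAIRS = [
    ("A. fumigatus", 1, 94, 286),
    ("A. fumigatus", 4, 239, 450),
    ("A. fumigatus", 7, 1097, 1271),
    ("C. albicans", 11, 103, 277),
    ("M. grisea", 20, 385, 620),
]


def expected_product_bp(primer_a_start, primer_b_start):
    """Inclusive span between the two start positions (see assumption above)."""
    if primer_b_start <= primer_a_start:
        raise ValueError("primer B must start downstream of primer A")
    return primer_b_start - primer_a_start + 1


if __name__ == "__main__":
    for species, seq_id, start_a, start_b in PRIMER_PAIRS:
        size = expected_product_bp(start_a, start_b)
        print(f"{species:<14} SEQ ID No. {seq_id:<3} approx. {size} bp")
```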

Appropriate controls include: (i) template DNA but no primers, and primers but no
template (negative controls); (ii) cDNA encoding fungal 2031 OR or DNA from
cultured fungi instead of patient DNA (positive control).

PCR reactions are run as follows:

Step 1    95°C    5 min
Step 2    95°C    1 min
Step 3    53°C    1 min 30 sec
Step 4    72°C    1 min 30 sec
Step 5    72°C    10 min
Step 6    4°C     Hold


30 cycles of steps 2-4 are carried out and the PCR products examined on agarose gels.
The production of a band of the correct molecular weight is diagnostic of the presence
of the particular fungus. It may be additionally necessary to carry out diagnostic
restriction digests of the PCR products. If necessary, PCR products are subcloned into
a vector, such as pGEM-Teasy (Promega), and sequenced to verify that the PCR
products are from the appropriate fungus.
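
As a planning aid, the total programmed hold time of the cycling protocol above can be estimated as sketched below. The hold times for steps 3 and 4 are read as 1 min 30 sec each; ramping between temperatures is instrument dependent and is not included, so the actual run will take longer. The names used are illustrative.

```python
INITIAL_DENATURATION_S = 5 * 60   # step 1, 95°C
CYCLED_STEPS_S = [60, 90, 90]     # steps 2-4: 95°C, 53°C, 72°C
FINAL_EXTENSION_S = 10 * 60       # step 5, 72°C
CYCLES = 30                       # repeats of steps 2-4


def hold_time_minutes(cycles=CYCLES):
    """Sum of programmed hold times in minutes, excluding ramps and the 4°C hold."""
    total_s = INITIAL_DENATURATION_S + cycles * sum(CYCLED_STEPS_S) + FINAL_EXTENSION_S
    return total_s / 60


if __name__ == "__main__":
    print(f"Programmed hold time: {hold_time_minutes():.0f} min")
```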

Alternatively, the presence of an infection with A. fumigatus, A. nidulans, C.
albicans, M. grisea, C. trifolii, F. graminearum, F. sporotrichioides or M.
graminicola, or other organisms, is detected by means of antibodies raised against the
fungal protein. One suitable means is the use of a capture ELISA. Here, microtitre
plates are coated with a monoclonal antibody raised against the fungal protein. Then
the plates are incubated with diluted patient samples, or appropriate protein extracts of
samples (particularly if the samples are biopsies or plant tissues). Plates are then
incubated with a polyclonal antibody (again against the fungal protein). Finally,
binding of the second antibody is detected by means of an enzyme-coupled or
fluorescently-labelled antibody directed against the polyclonal. In practice, two
monoclonal or polyclonal antibodies or various combinations may be used.

Example 9. Production of an antibody

Antibodies against the fungal 2031 ORs will be of considerable use as diagnostic
reagents (see Example 8 above). As an immunogen, recombinant domains are used (as
described in Example 6). Alternatively, synthetic proteins encoding regions either
unique to the individual 2031 ORs, or likely to provide cross-reactivity within a set of
ORs, a set of species, or a range of genera are used. Peptides may need to be
conjugated to carrier proteins before immunization.

Preimmune sera from animals to be immunised are screened against the
immunogen to ensure that there is no endogenous cross-reactivity. Animals (typically
sheep, rabbits or mice) are then immunised. For polyclonal antibody production, the
resulting sera are affinity purified using the immunogen cross-linked to a
chromatography matrix. Alternatively, purification of the antibody fraction from the
serum, e.g. using protein G or protein A cross-linked to a matrix, may be sufficient.
Monoclonal antibody production proceeds by methods familiar to those skilled in the
art.

The specificities of the resulting polyclonal and/or monoclonal antibodies are
checked by ELISA and/or western blotting using the immunogen, related constructs or
whole cell lysates and extracts as targets. Negative controls, such as other ORs,
different constructs or different species, are also employed to test specificity and/or to
determine the range of species and/or genus cross-reactivity.



Example 10. Production of fungi with 2031 OR genes functionally disabled

A BAC (bacterial artificial chromosome) clone library containing the A. fumigatus
genome, partially digested with BamHI and inserted into the vector pBACe3.6, was
purchased from the Sanger Centre, Cambridge, UK. The BAC clone containing the
gene to be inactivated is identified by bioinformatics (BLAST searching of Sanger
BAC and related databases) and the glycerol stock of the clone grown up in 50 ml LB,
20 µg/ml chloramphenicol at 37°C overnight. The overnight culture is centrifuged at
4,500 rpm for 15 min. The bacterial pellet is resuspended in 4 ml of buffer P1 (Qiagen
plasmid miniprep kit) and then 4 ml of buffer P2 (Qiagen plasmid miniprep kit, lysis
buffer) is added and mixed gently by inverting 3-6 times. Proteins and genomic DNA
are precipitated by adding 4 ml of buffer P3 (Qiagen plasmid miniprep kit, neutralizing
buffer) and incubating on ice for 10 minutes. Following centrifugation of the
mixture at 4,500 rpm for 30 min, the supernatant is transferred into a 50 ml Falcon tube,
an equal volume of phenol/chloroform (1:1) mixture is added, and the mixture
centrifuged for 15 min at 4,500 rpm. The supernatant is then transferred into an
Oakridge tube and 0.7 volumes isopropanol are added. After mixing, the tube is
centrifuged at 10,000 rpm (Beckman centrifuge, rotor JA-17) for 30 min at 4°C. The
resulting pellet is washed with 2 ml 70% ethanol at the same speed. The resulting BAC
DNA is resuspended in 100 µl buffer EB.

The transposition reaction is carried out as follows. 7 µl purified BAC, 1 µl
transposon pZVICZ (an engineered plasmid, the sequence of which is given as SEQ ID
No. 81, containing the mosaic ends of pMOD2 (Epicenter), a kanamycin resistance
gene and a Zeocin resistance gene under the control of a fungal promoter) and 1 µl
EZ:TN transposase (Epicenter) are incubated at 37°C for two hours, after which 1 µl stop
solution (1% SDS) is added and the mixture heated to 70°C for 10 minutes.
Electrocompetent GeneHogs E. coli cells (Invitrogen) are then transformed with the
transposed BAC, the cells plated onto LB agar, 25 µg/ml kanamycin, 20 µg/ml
chloramphenicol, and plates incubated overnight at 37°C.

At least 96 colonies are picked and grown up in 96-well plates in 2xLB (double
concentrated LB), 20 µg/ml chloramphenicol, at 37°C overnight. BAC DNA is then
purified using the Millipore Montage 96 BAC kit using a MWG ROBOSEQ 4200
robot. BACs containing the transposon inserted into the gene of interest are identified
by PCRs both spanning the gene of interest and extending from the transposon into the
BAC. Insertion into the gene of interest is manifested as an increase in product size.
Southern blots are also carried out to ensure that the transposon has only inserted once
into the BAC.

The BAC is then linearised using a restriction enzyme determined to cut in the
vector backbone but not the BAC DNA, and used to transform A. fumigatus strain
Af293. A. fumigatus (haploid) protoplasts are prepared using 5% Glucanex (Novo
Nordisk A/S) solution (in 0.6 M KCl) and shaking for 2 h at 80 rpm at 30°C. The
protoplasts are washed with 0.6 M KCl and then with STC (Sorbitol, Tris, CaCl2). The
washed protoplasts are diluted in STC to 10^/ml and 100 µl transferred into 14 ml
Falcon tubes. 7 µl of linearised BAC are added to the tube and the whole mixture
incubated on ice for 20 min. Transformation is carried out by adding 200 µl of PEG
8000 solution (60% w/v, pH 7.5) drop-wise over 2 min and then adding 800 µl PEG.
The mixture is left at room temperature for 20 min. Transformed protoplasts are
washed with STC, resuspended in 1 ml STC, spread onto CM-sorbitol-Zeocin (250
µg/ml) plates and incubated at 37°C.

After 4-10 days of incubation, Zeocin-resistant colonies are picked and checked
for presence of the knocked-out gene by PCR using primers which specifically amplify
the whole gene of interest. Usually 10-20 transformants are checked. Ectopic
integration of the BAC gives two bands by PCR, one for the endogenous gene and one
for the BAC/transposon construct, which has a higher molecular weight. Replacement
of the endogenous gene with the transposon-modified gene results in a single band of
higher molecular weight by PCR. If none of the transformants show the disrupted
endogenous gene, the gene of interest may be essential, with the knock-out cells
having died and only cells where replacement is unsuccessful surviving. In this case,
the transformation is carried out on diploids using the same method of transformation.
Essentiality of the gene is then tested by rehaploidisation, and examining the
segregation pattern in haploids.
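
The interpretation of the PCR band patterns described above can be set out as a simple decision rule. The sketch below is illustrative only: the function name, the size tolerance and the example band sizes are hypothetical, and the transposon length must be supplied by the user. The tolerance must be smaller than half the transposon length for the two band classes to remain distinct.

```python
def classify_transformant(band_sizes_bp, wild_type_bp, transposon_bp, tolerance_bp=100):
    """Interpret whole-gene PCR bands from a Zeocin-resistant colony.

    band_sizes_bp: observed band sizes; wild_type_bp: size of the endogenous gene
    product; transposon_bp: length added by the inserted transposon.
    """
    disrupted_bp = wild_type_bp + transposon_bp

    def near(size, target):
        return abs(size - target) <= tolerance_bp

    has_endogenous = any(near(band, wild_type_bp) for band in band_sizes_bp)
    has_disrupted = any(near(band, disrupted_bp) for band in band_sizes_bp)

    if has_endogenous and has_disrupted:
        return "ectopic integration"   # two bands
    if has_disrupted:
        return "gene replacement"      # single band of higher molecular weight
    if has_endogenous:
        return "endogenous gene only"
    return "uninterpretable"


if __name__ == "__main__":
    # Hypothetical sizes: 1300 bp gene product, 3000 bp transposon.
    for bands in ([1300, 4300], [4300], [1300]):
        print(bands, "->", classify_transformant(bands, 1300, 3000))
```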
Example 11. Rescue of Mycobank transformant with the 2031 oxidoreductase gene.

11.1 Preparation of the 2031 OR construct

The 2031 OR gene with NheI overhangs was prepared by PCR using the primer pair
SEQ ID No. 98 and SEQ ID No. 99.

PCR Reaction: 2.5 µl 10x PCR buffer
              0.5 µl dNTPs
              2 µl MgSO4
              1 µl forward primer (SEQ ID No. 98)
              1 µl reverse primer (SEQ ID No. 99)
              1 µl gDNA
              Made up to 25 µl with n.H2O

PCR Cycle: (1) 94°C, 5'; (2) 94°C, 1'; (3) 50°C, 1'; (4) 68°C, 1'30s; (5) 68°C, 10';
(6) 8°C, Pause; Cycles 2 to 4 were repeated 40 times

The finished amplicon (~1260 bp) was run out on a 1% agarose gel, the appropriate
band was cut out and purified using the Qiagen gel extraction kit and eluted off the
column in 30 µl H2O. The amplicon was ligated into pGEM-Teasy using the following
reaction mixture:

5 µl 2x ligation buffer
1 µl pGEM-Teasy vector
either 1, 2 or 3 µl of insert
1 µl T4 DNA ligase
Reaction made up to 10 µl with n.H2O

The ligation reaction was incubated overnight in the fridge.

2 µl of each ligation reaction was transformed by heat shock at 42°C into
Promega 96 select cells. After transformation, cells were incubated in SOC for 1 h at
37°C, 220 rpm. 50 and 150 µl aliquots were then spread over LB-Amp (100 µg/ml),
IPTG-Xgal plates and left at 37°C overnight. Positive clones were identified by
blue/white screening and were isolated and screened by PCR for correct insertion of
the 2031 OR insert using the above primers. Positive clones were sent away to MWG
for sequence analysis.

11.2 Cloning of 2031 OR into the CbhB-Zeo vector

Plasmid DNA for 2031 OR in pGEM-Teasy (as described in 11.1) was digested
overnight at 37°C with NheI. The 2031 OR insert fragment was then gel purified
using the Qiagen gel extraction kit and ligated into CbhB-Zeo vector. This vector was
constructed from pUC19 with the A. fumigatus CbhB promoter and terminator and the
zeocin resistance gene.

Ligation: 1 µl of T4 DNA ligase
          1 µl of 10x ligase buffer
          1 µl of CbhB vector (linearised and alkaline phosphatase treated)
          1 µl of insert
          6 µl n.H2O

The ligation reaction was left in the fridge overnight.

2 µl of each ligation reaction was transformed by electroporation at 2.5 kV, 200
Ω, 25 µF into GeneHogs cells. After transformation, cells were incubated in SOC for 1 h
at 37°C, 220 rpm. 50 and 150 µl aliquots were then spread over LB-Amp (100 µg/ml)
plates and left at 37°C overnight. Positive clones were isolated and screened by PCR
for the correct insertion of the insert as above. Positives were sent to MWG
for sequence analysis.

11.3 Transformation into Mycobank mutant 2031

The CbhB-Zeo-2031 plasmid was digested with ScaI overnight at 37°C. Linearised
plasmid was then run out on a 1% agarose gel and purified using the Qiagen gel
extraction kit. Plasmid DNA was eluted in 30 µl of nH2O.

Mycobank mutant 2031 AF293 spores were swollen for 6 h at 37°C, 300
rpm, centrifuged at 3,500 rpm for 5' and resuspended in ice-cold nH2O. Spores were spun
again at 3,500 rpm for 5', then resuspended in 12.5 ml of YED medium and incubated for 1 h
at 30°C, 100 rpm. Spores were then counted and resuspended in EB buffer to a final
concentration of 5x10^ spores per ml. 50 µl of swollen spores were then transformed
with 1-10 µl of linearised CbhB-Zeo-2031 plasmid DNA at 1 kV, 400 Ω, 25 µF.
Spores were transferred into YED buffer and left for 90' at 37°C, 100 rpm. 100 and
200 µl aliquots were then spread out onto CM-Zeocin (200 µg/ml) plates and
incubated at 37°C for 2-3 days.

Positive transformants on the CM-Zeo plates were picked into 5 ml of SAB
broth and incubated overnight at 37°C, 220 rpm. Biomass was then filtered and
collected onto Whatman paper. DNA was extracted using the Fast prep kit and
cleaned up over a Qiagen miniprep DNA column. DNA was eluted off the column in 30
µl of nH2O.

PCR screening was performed using the following primer sets:
Set A: Ox7race_for (SEQ ID No. 51) + CbhBtR (SEQ ID No. 100)
Set B: Ox6race_rev (SEQ ID No. 50) + CbhBpF (SEQ ID No. 101)

PCR Reaction: 12.5 µl 2x Reddy mix
              1 µl each primer, from set A or B
              1 µl plasmid DNA
              Made up to 25 µl with water

PCR Cycle: (1) 94°C, 5'; (2) 94°C, 1'; (3) 56°C, 1'; (4) 72°C, 1'30s; (5) 72°C, 10';
(6) 8°C, Pause; Cycles (2) to (4) were repeated 40 times

Positive transformants which were demonstrated to have CbhB-Zeo-2031 in
Mycobank mutant 2031 were put through the rehaploidisation process to test their ability
to grow on hygromycin compared with the untransformed Mycobank mutant 2031. We
found that the lethal 2031 phenotype was rescued by the insertion of the CbhB-Zeo-2031
plasmid, confirming the essentiality of 2031 OR.

The reader's attention is directed to all papers and documents which are filed
concurrently with or previous to this specification in connection with this application
and which are open to public inspection with this specification, and the contents of all
such papers and documents are incorporated herein by reference.

All of the features disclosed in this specification (including any accompanying
claims, abstract and drawings), and/or all of the steps of any method or process so
disclosed, may be combined in any combination, except combinations where at least
some of such features and/or steps are mutually exclusive.

Each feature disclosed in this specification (including any accompanying claims,
abstract and drawings), may be replaced by alternative features serving the same,
equivalent or similar purpose, unless expressly stated otherwise. Thus, unless
expressly stated otherwise, each feature disclosed is one example only of a generic
series of equivalent or similar features.

The invention is not restricted to the details of the foregoing embodiment(s). The
invention extends to any novel one, or any novel combination, of the features disclosed
in this specification (including any accompanying claims, abstract and drawings), or to
any novel one, or any novel combination, of the steps of any method or process so
disclosed.
Sequence Listing



SEQ ID No 1 

GTTCGACGTCATTGCCACGTTTCGACCCAAGGGCAGACGCCATGTCGCCGAGCGATCGCCGCGATATGCCTCGAATT 
TGCGCCATTCGGCATCCAGTTTCCAGTGCCCTTCCCCGAATGACTGTCTCCAGTATTCGGCAAGATTGTAZiATCAAG 
CCTGAAGAAGCGGAGCAATTCTTGGAAGTCGTATGTTCTACTGATTTCTGTGCCTGGCGCAGACGGGTATATAAATA 
AAGATCACCGCACCGAGGAGTTTCTTACCAACCCATCAATAACCATCCACAATCTCCTACAACAAAAATGACTGTCG 
CCGATATCGACGTTCCTCCTGCCGAGGGCATCCCCTACTTCACTCCGGCCCAGAACCCTCCTGCCGGTACGGCAGCT
AACCCCCAGACCAATGGCCAGAAGATCCCCAAGCTCTTCACGCCCTTGACCATCCGTGGCGTCACCTTCCAGAACCG 
CCTTGGTGTAAGTCCGTTTGCCCTTGCTCATATCGACGAAAGCTAATCCCCCGTCAGCTCGCGCCCCTCTGCCAATA 
CTCCGCCCAGGACGGCCACATGACCGACTACCACATCGCCCATCTGGGTGGGATCGCCCAACGCGGACCCGGCCTGA
TGCTGATTGAGGCGACCGCCGTCCAGCCCGAAGGCCGCATCACCCCTCAGGATGTCGGTCTGTGGAAGGACTCCCAG 
ATCGCCCCGATGCGCCGGGTCATCGACTTCGTGCACAGCCAGGGCCAGAAGATCGGCGTGCAGCTTGCCCATGCCGG 
CCGGAAAGCCACCACCGTTGCGCCCTGGATCTCATTCTCGGCCATCGCGACGGAGAAGGTCGGCGGATGGCCGGACC 
GCGTCAAAGGGCCCGGGGATATCCCCTTTGCGGAGCCCTTCGCCAAGCCCAAGGCCATGACGCTGGATGAGATCGAG 
CAGTTCAAGAAGGACTGGGTGGCGGCCACGAAGCGCGCCATCGCCGCCGGTGCGGACTTTGTCGAGATTCACAATGC 
GCATGGATACCTGCTGTCGTCATTCCTCTCGCCGGCCGCCAACAACCGCACGGACCAGTACGGCGGGTCGTTCGAGA 
ACCGCATCCGGCTGTCTCTCGAGATTGCGCAGTTGACTCGGGACGCCGTCGGCCCTCATGTGCCCGTTTTCCTGCGC 
ATTTCGGCCTCGGACTGGTGCGAGGAGACCCTGCCGGAGCAGAGCTGGAAGTCGGAGGATACCGTGCGGTTCGCGCA 
GGAGCTGGTCAAGCAGGGCGCCGTTGATCTGATCGATATCAGCAGCGGTGGTGTTCTCGCGCAGCAGAAGATCAAGT 
CCGGCCCTGCCTTCCAGGTGCCTTTTGCCGTGGCCGTGAAGAAGGCCGTCGGCGACAAGCTGCTGGTTGCCGCCGTG 
GGTGCCATCACCAACGGCAAGCAGGCGAATCAGATTCTAGAGGAGCAGGATATCGACGTTGCGCTGGTTGGCCGTGG 
GTTCCAGAAGGATCCCGGTCTGGCCTGGACGTTTGCTCAGCACCTCGGCGTCGAAATCTCCATGGCCAACCAGATCC 
GCTGGGGCTTCACCCGGCGTGGAGGCACCCCGTACATTGATCCTTCGGTGTACAAGCAGTCTATTTTCGATGTATAG 
AGTATAGATAGAGTTGAAGATGATACCTCATAGACGATCAATGGACCCTTGCATATTATTTCTCGTCTCCTGCGTAT 
GTTCAAGGTATTCACAGTAGCTGCGTCCTCTTAAGTTTCTCCGTCATTCGTTCTATTCTACTCCAATCGCAACGCAT 
GGCGACCACGGATCGAGTCGAATTTCTCCGTCGTTCGTATCTGATCAATATAAAAAGCGGGGAATGGCTTGACCCCG 
CGCAGAATGTCGATCTCTTCGCAAACTCTCGGTGTATAGGACGCTCAGCAACGATCAAGG 



SEQ ID No 2 

GTATGTTCTACTGATTTCTGTGCCTGGCGCAGACGGGTATATAAATAAAGATCACCGCACCGAGGAGTTTCTTACCA 
ACCCATCAATAACCATCCACAATCTCCTACAACAAAAATGACTGTCGCCGATATCGACGTTCCTCCTGCCGAGGGCA 
TCCCCTACTTCACTCCGGCCCAGAACCCTCCTGCCGGTACGGCAGCTAACCCCCAGACCAATGGCCAGAAGATCCCC 
AAGCTCTTCACGCCCTTGACCATCCGTGGCGTCACCTTCCAGAACCGCCTTGGTCTCGCGCCCCTCTGCCAATACTC 
CGCCCAGGACGGCCACATGACCGACTACCACATCGCCCATCTGGGTGGGATCGCCCAACGCGGACCCGGCCTGATGC 
TGATTGAGGCGACCGCCGTCCAGCCCGAAGGCCGCATCACCCCTCAGGATGTCGGTCTGTGGAAGGACTCCCAGATC 
GCCCCGATGCGCCGGGTCATCGACTTCGTGCACAGCCAGGGCCAGAAGATCGGCGTGCAGCTTGCCCATGCCGGCCG 
GAAAGCCACCACCGTTGCGCCCTGGATCTCATTCTCGGCCATCGCGACGGAGAAGGTCGGCGGATGGCCGGACCCGC 
GTCAAAGGGCCCGGCGATATCCCCTTTGCGGAGCCCTTCGCCAAGCCCAAGGCCATGACGCTGGATGAGATCGAGCA 
GTTCAAGAAGGACTGGGTGGCGGCCACGAAGCGCGCCATCGCCGCCGGTGCGGACTTTGTCGAGATTCACAATGCGC 
ATGGATACCTGCTGTCGTCATTCCTCTCGCCGGCCGCCAACAACCGCACGGACCAGTACGGCGGGTCGTTCGAGAAC 
CGCATCCGGCTGTCTCTCGAGATTGCGCAGTTGACTCGGGACGCCGTCGGCCCTCATGTGCCCGTTTTCCTGCGCAT 
TTCGGCCTCGGACTGGTGCGAGGAGACCCTGCCGGAGCAGAGCTGGAAGTCGGAGGATACCGTGCGGTTCGCGCAGG 
AGCTGGTCAAGCAGGGCGCCGTTGATCTGATCGATATCAGCAGCGGTGGTGTTCTCGCGCAGCAGAAGATCAAGTCC 
GGCCCTGCCTTCCAGGTGCCTTTTGCCGTGGCCGTGAAGAAGGCCGTCGGCGACAAGCTGCTGGTTGCCGCCGTGGG 
TGCCATCACCAACGGCAAGCAGGCGAATCAGATTCTAGAGGAGCAGGATATCGACGTTGCGCTGGTTGGCCGTGGGT 
TCCAGAAGGATCCCGGTCTGGCCTGGACGTTTGCTCAGCACCTCGGCGTCGAAATCTCCATGGCCAACCAGATCCGC 
TGGGGCTTCACCCGGCGTGGAGGCACCCCGTACATTGATCCTTCGGTGTACAAGCAGTCTATTTTCGATGTATAGAG 
TATAGATAGAGTTGAAGATGATACCTCATAGACGATCAATGGACCCTTGCATATTATTT 



SEQ ID No 3 

MTVADIDVPPAEGIPYFTPAQNPPAGTAANPQTNGQKIPKLFTPLTIRGVTFQNRLGLAPLCQYSAQDGHMTDYHIA 

HLGGIAQRGPGLMLIEATAVQPEGRITPQDVGLWKDSQIAPMRRVIDFVHSQGQKIGVQLAHAGRKATTVAPWISFS 

AIATEKVGGWPDRVKGPGDIPFAEPFAKPKAMTLDEIEQFKKDWAATKRAIAAGADFVEIHNAHGYLLSSFLSPAA 

NNRTDQYGGSFENRIRLSLEIAQLTRDAVGPHVPVFLRISASDWCEETLPEQSWKSEDTVRFAQELVKQGAVDLIDI 

SSGGVLAQQKIKSGPAFQVPFAVAVKKAVGDKLLVAAVGAITNGKQANQILEEQDIDVALVGRGFQKDPGIJVWTF 

HLGVEISMANQIRWGFTRRGGTPYIDPSVYKQSIFDV 

SEQ ID No 4 

atgtcgcaacctgttgtgcctgacatcgagaacaaacccgcgccgggtatctcgtactttactccggcg.caagagcc 
gcctgctggcaccgctgctaatcctcagtctgatggatcggcacctcccaagctcttccggccgctttcggtgcggg 
gtctgacctttcacaatcgcattggcgtgagtgcagtccaggcaattatgctatccatcctatgcgagcccttgcat
tggaacagccgcttacagggaatgataatgagtagctatcgccactctgccaatactcagccgacgatggacacatg 
actccctggcatatggcacatcttggagggattgcccagcgagggccaggattcttgatggtcgaggcaacagcagt 
cgaaccggaaggcaggatcaccccgcaggacctgggactatggaaagactcgcagattgagccattgagccgcgtga 
tcgagtttgtccacagtcagaaccagcttatcggcgtgcagatcgcacacgcaggtcgcaaggccagcaccgtcgcg
ccatggctctcggccaacgataccgcctccgagaagatgggcggctggccaggccgcgtcaaaggcccgacaaatgt 
gcccttcaccgttaagaaccctgtgccgaaggagatgaccaagcaggatatcgaggatctgaagaccgcctgggtgg 
ccgctgtcaaacgggctgttaaggccggagccgactttatcgagatccacaatgcgcatggctatcttctgatgtcg 
ttcctctcccctgcggtcaacacgagaacagacgagtacggaggcagttttgagaatcgcatccggctcagtctgga 

gatcgccaagctcacccgcgaaaatgtgcccaaggatatgcctgtcttcctgcgggtctccgccaccgattggctgg
aggaggtgcagccgaacaagcccagctggcgaggcgtggacactgtccgatttgcgaagatcctggcagaaacgggt 
tacgttgacgtgcttgacgtgagcagtggcggcactcattcggagcagcatatccacgcgaagccaggcttccaggc 
accctttgctattgccgtcaagaacgccgtcggggacaaactcgcagtggcatcagtgggtatgattgccagcgcgc 
atttggccaattccttgttggagaaggacggactggaccttgtgctggttggacgtggcttccagaagaacccgggg 

ctggtgtgggcgtgggccgacgagctgaatgtagagatctccatggctaatcagatccgatggggtttctcgcggcg
cggtgctggtccttacctcaggaagaaactcgagaagatataa 

SEQ ID No 5 

ATGTCGCAACCTGTTGTGCCTGACATCGAGAACAAACCCGCGCCGGGTATCTCGTACTTTACTCCGGCGCAAGAGCC 

GCCTGCTGGCACCGCTGCTAATCCTCAGTCTGATGGATCGGCACCTCCCAAGCTCTTCCGGCCGCTTTCGGTGCGGG
GTCTGACCTTTCACAATCGCATTGGCCTATCGCCACTCTGCCAATACTCAGCCGACGATGGACACATGACTCCCTGG
CATATGGCACATCTTGGAGGGATTGCCCAGCGAGGGCCAGGATTCTTGATGGTCGAGGCAACAGCAGTCGAACCGGA 
AGGCAGGATCACCCCGCAGGACCTGGGACTATGGAAAGACTCGCAGATTGAGCCATTGAGCCGCGTGATCGAGTTTG 
TCCACAGTCAGAACCAGCTTATCGGCGTGCAGATCGCACACGCAGGTCGCAAGGCCAGCACCGTCGCGCCATGGCTC 

TCGGCCAACGATACCGCCTCCGAGAAGATGGGCGGCTGGCCAGGCCGCGTCAAAGGCCCGACAAATGTGCCCTTCAC
CGTTAAGAACCCTGTGCCGAAGGAGATGACCAAGCAGGATATCGAGGATCTGAAGACCGCCTGGGTGGCCGCTGTCA 
AACGGGCTGTTAAGGCCGGAGCCGACTTTATCGAGATCCACAATGCGCATGGCTATCTTCTGATGTCGTTCCTCTCC 
CCTGCGGTCAACACGAGAACAGACGAGTACGGAGGCAGTTTTGAGAATCGCATCCGGCTCAGTCTGGAGATCGCCAA 
GCTCACCCGCGAAAATGTGCCCAAGGATATGCCTGTCTTCCTGCGGGTCTCCGCCACCGATTGGCTGGAGGAGGTGC 

AGCCGAACAAGCCCAGCTGGCGAGGCGTGGACACTGTCCGATTTGCGAAGATCCTGGCAGAAACGGGTTACGTTGAC
GTGCTTGACGTGAGCAGTGGCGGCACTCATTCGGAGCAGCATATCCACGCGAAGCCAGGCTTCCAGGCACCCTTTGC 
TATTGCCGTCAAGAACGCCGTCGGGGACAAACTCGCAGTGGCATCAGTGGGTATGATTGCCAGCGCGCATTTGGCCA 
ATTCCTTGTTGGAGAAGGACGGACTGGACCTTGTGCTGGTTGGACGTGGCTTCCAGAAGAACCCGGGGCTGGTGTGG 
GCGTGGGCCGACGAGCTGAATGTAGAGATCTCCATGGCTAATCAGATCCGATGGGGTTTCTCGCGGCGCGGTGCTGG 

TCCTTACCTCAGGAAGAAACTCGAGAAGATATAA

SEQ ID No 6 

MSQPWPDIENKPAPGISYFTPAQEPPAGTAANPQSDGSAPPKLFRPLSVRGLTFHNRIGLSPLCQYSADDGHMTPW 
HMAHLGGIAQRGPGFLMVEATAVEPEGRITPQDLGLWKDSQIEPLSRVIEFVHSQNQLIGVQIAHAGRKASTVAPWL 
SMTDTASEKMGGWPGRVKGPTNVPFTVKNPVPKEMTKQDIEDLKTAWVAAVKRAVKAGADFIEIHNAHGYLLMSFLS
PAVNTRTDEYGGSFENRIRLSLEIAKLTRENVPKDMPVFLRVSATDWLEEVQPNKPSWRGVDTVRFAKILAETGYVD 
VLDVSSGGTHSEQHIPIAKPGFQAPFAIAVKNAVGDKLAVASVGMIASAHLANSLLEKDGLDLVLVGRGFQKNPGLVW 
AWADELNVEISMANQIRWGFSRRGAGPYLRKKLEKI

SEQ ID No 7

ATGGGTTCCAACGCCTTCCGGTCCCCCGCCGTCACCAAGTCCTCCTCCACCCCCTACTACACTCCCGCCAACAATGG 
AGGCGCCGCCCTGCACCCCGACGACCCCACGACCCCTACGCTCTTCCGGCCCTTACAAATCCGCAATGTGACGCTCA 
AGAACCGCATCATGGTGTCGCCCATGTGCATGTACTCCTGCGAGTCGGACGCGTCGTCTCCCCACGTCGGCGCCCTA 
ACAAACTACCACCTGGCGCATCTGGGCCACCTCGCCCTCAAAGGCGCAGGCCTCGTCTTCATCGAAGCCACCGCCGT 

GCAGCCCAACGGGCGCATCTCCCCCAACGACTCGGGCCTCTGGCAGGACGGCACCACCTCGGAACAATTCGTGGGGC
TGAAGCGGGTCGTCGAGTTCATGCACGCACAGGGCGCCAAGGTCGGGATCCAGCTTGCGCATGCGGGCCGGAAAGCG 
AGTGCCGTTGCGCCGTGGCTGGCGGCGCAGGCGGGCAAGTCGAGTCTGAAGGCGGATGAGAGCGTTGGCGGGTGGCC 
CGCGGATGTGGTGGGTCCGTCGGGCGGGGAGGAGCATATCTTTAGTCCCGAGGAGGATGCGTATTGGGTGCCGCGGG 
CGCTGAGCACGGCCGAGGTCCGTCAGGTGGTGGCGGCGTTTGCGAAGAGCGCGCGGCTAGCGGTGCAGGCTGGGGTG 

GATGTTATCGAGATCCATGGGGCGCATGGCTATCTCATCAACGAGTTCCTGAGCCCGGTCACGAATAAGCGGACGGA
TGCGTACGGCGGGAGCTTTGAGAACCGGACCCGGATCGTGCGCGAGGTTGCGGCGGCTATTCGTGCGGTGATTCCCG 
AGGGGATGCCCCTGTTTCTGCGTATCAGCGCCACGGAGTGGTTGGAGGGTCAGCCGGTGGCCGCGGAGTCGGGCAGC 
TGGGATATGCAGAGCTCGCTGGAGCTGGTCAAGAAGCTGCCCGAATGGGGCATTGACCTGGTGGATGTCAGCTCCGC 
CGCGAACCACAAGGACCAGAAGATCAACCTGCACACGGCCTACCAGACGGACCTGGCCGGGCAGATTCGCCAGGCCA 

TCCGAGCGGCTGGCGCGTCGACTCTTGTGGGTGCTGTAGGTCTGATCACCGATTCGGAACAGGCGAGGGGACTAGTT
CAGGGAGCGGACGAGGCGACTGCAGCCGAGGCAATGCTGTCGGGACCTGAACCCAAGGCGG^lTGCCATTCTGATAGC 
CCGTCAGTTCCTGCGCGAGCCAGAATGGGTGTTTTCCACGGCGAGAAAGTTGGGCGTGCCGGTGACTGTCCCGGTGC 

AGTTTGGCAGGGCCATTTAG 
SEQ ID No 8



, ■ MGSNAFRS PAVTKS S S TPYYTPiUSTNGGAALHPDDPTTPTLBllPLQIRNVTLKNRIMVS PMCMYS CE S DP S S PHVGAL 

■ TNYHLAHLGHIJiiKGAGLVFXEATAVQPNGRISPNDSGLWQDGTTSEQFLGLKRVVEFiy^ 
SAVAPWLAAQAGKSSLPGADESVGGWPADWGPSGGEEHIFSPEEDAYWVPRALSTAEVRQWAAFAKSM 
DVIEIHGAHGYLINEFLSPVTNKRTDAYGGSFENRTRIVREVAAAXRAVIPEGMPLFLRISATEWLEGQPVAAESGS 
WDMQSSLELVKKLPEWGIDLVDVSSAANHKDQKINLHTAYQTDLAGQIRQAIRAAGASTLVGAVGLITDSEQARGLV 
QGADEATAAEAmSGPEPBCMAILIARQFLREPEWVFSTARKLGVPVTV^^ 

SEQ ID No 9

ATGGCTCTCCCTGACGTCGAAAACACCCCCGCCGCCGGCATCCCCTACTTTACACCAGCACAGAAGCCTCCTGCTGG 
AACAGCTGCCAACCCGCAAACCAGCGGCAATGCCGTCCCCAAGCTGTACACACCTCTGACGGTGCGTGGGGTGACCT 
TCCACAACAGACTTGGCCTCGCGCCGCTCTGCCAGTACTCCGbAGAAGACGGCCACATGACAGACTACCACATCGCG 
CACTTGGGAGGTATTGCCCAGCGCGGCCCCGGTCTCATGATGATCGAGGCAACCTCCGTCTCACCTGAAGGCAGAAT
CACGCCGCAGGACGTCGGTTTATGGAAGGACTCGCAGATTGCGCCCATGAAGCGCGTCATCGACTTCGTGCACTCGC 
AGTCCCAGAAGATTGGCGTGCAGATTGCCCACGCCGGCCGCAAGGCTTCGAACATCGCCCCCTGGCTCATGAACAAG 
GGCATCGTCGCGACGGAGAAGGTCGGTGGCTGGCCGGATCGTGTGATCGGCCCGTCCACCGTGCCCTTCCACGAGAC 
TTTCCCCACCCCCAAGGCCATGACCAAGGACGACATCGAGCAGTTCAAGCGCGACTGGTTTGATGCGTGCAAGCGGG 

CCATTGCCGCTGGCGCGGACTTCATCGAGATCCACAATGCCCACGGGTATCTTCTCTCGTCTTTCCTATCACCGTCT
TCCAACACGCGCACCGACGAGTACGGCGGCTCCTTTGAGAACCGCATCCGGCTCTCTCTCGAAATCGCCCAGGTCAC 
CCGTGACGCCGTCGGCCCCAACGTTCCTGTTTTTCTCGGTGTCTCCGCGACGGACTGGATCGAGGAGACCCTCCCCG 
AGGAATCGTGGAAGCTCTCTGACTCCGTCCGCTTCGCCGAAGCCCTCGCTGCCCAGGGCGCTATTGACCTGATCGAC 
GTCTCTTCCGGCGGTGTCCACGCCGCGCAGAAGATCAAGTCCGGGCCGGCTTTCCAGGCTCCCTTCGCTGTGGCTAT 
CAAGAAGGCCGTTGGCGATAAGCTCCTTGTTGCGACGGTGGGCACGATCACGAACGGTAAGCAGGCGAACAAGCTGC 
TTGAGGAGGAGGGATTGGATGTTGCGCTTGTGGGACGTGGTTTCCAGAAGGATCCCGGTCTGGCGTGGACTTTCGCG 
CAGCATCTTGATGTTGAGATTGCGATGGCGAGTCAGATTCGGTGGGGATTCACAAGGCGCGGGGGCACGCCTTATAT 
CGACCCCAAAGCTTATAAGGAGAGCATCTTTGAGTAA 

SEQ ID No 10 

MALPDVENTPAAGIPYFTPAQNPPAGTAANPQTSGNAVPKLYTPLTVRGVTFHNRLGLAPLCQYSAEDGHMTDYHIA 

HLGGIAQRGPGLMMIEATSVSPEGRITPQDVGLWKDSQIAPMKRVIDFVHSQSQKIGVQIAHAGRKASNIAPWLm 

GIVATEKVGGWPDRVIGPSTVPFHETFPTPKAMTKDDIEQFKRDWFDACKRAIAAGADFIEIHNAHGYLLSSFLSPS 

SNTRTDEYGGSFENRIRLSLEIAQVTRDAVGPNVPVFLRVSATDWIEETLPEESWKLSDSVRFAEALAAQGAIDLID 

VSSGGVHAAQKIKSGPAFQAPFAVAIKKAVGDKLLVATVGTITNGKQANKLLEEEGLDVALVGRGFQKDPGLAWTFA 

QHLDVEIAMASQIRWGFTRRGGTPYIDPKAYKESIFE

SEQ ID No 11

ATGACAGTTCCATACCAAGTAAAACCATCAGATGAAATCAAAGGTGCTCCTGAGGTTTCCTATTACACTCCAGAACA 

GCCTGTTCCGGCTGGTACTTTTTATCCCCAATCGTCAGATGAAGTTGCTCCCAAAATTTTTCAACCTTTAAAGATTG 

GTAAGCTTGCTTTGCCAAACAGAATTGGGGTATCTCCAATGTGTCAATATTCTGCTGATTATAATTTTGAAGCAACT 

CCATACGATTTAATCCATTATGGTTCATTAGTGAATCGTGGGCCAGGTATCACCATTGTTGAAAGCACGGCTGTTTC 

TCCTGAGGGTGGATTATCACCTCATGATTTAGGAATCTGGAAGGATGAACAAGCAGAGAAATTGAAACCAATTGTCG 

ATTACGCTCATTCTCAAAAGCAATTAATTGCCATCCAATTGGGCCATGGTGGTAGAAAAGCTTCTGGTCAGCCCTTA 

TTTTTGCACTTGGAACAAGTTGCAGATAAATCTGTCAATGGGTTTGCCGACAAAGCAGTTGCTCCTTCTGCATTGGC 

ATTCAGACCAAATGGTAATTTACGTGTTCCTAATGAGTTGACCAAAGATGAAATCAAACGTGTTGTTAAGGATTTTG 

GTGCTGCTGCTAGAAGAGCTGTTGAAATCAGTGGCTTTGATGCAGTTGAGATTCATGGTGCTCATGGTTATTTGATT 

AATGAGTTCTATAGTCCTATTTCAAACAAGAGAACAGATGAATACGGTGGCAGTTTTGAA2^TAGAACCAGATTTTT 

AAAGGAAGTTATCGATAGTGTTAAATCAAGTATTCCAAACGATGTTCCAGTGTTTTTGAGAATCTCTGCTGCTGAAA 

ATAGTCCTGATCCAGAAGCTTGGACTATTGAAGATTCCAAAAAATTAGCTGACATTTTAGTAGAAAAGGGTATTGCT 

TTGGTTGATGTTTCATCTGGTGGTAACGATTATAGACAACCACCAAGATCTGGGATCAGTAAAGAGTTGAGAGAGCC 

AATCCATGTTCCGTTGTCTCGTGCAATTAAACAACATGTTGGTGACAAGTTATTGGTCAGTTGCGTTGGTGGGCTTG 

AAAAAGATCCTGAATTGCTCAACAAATATTTAGAAGAAGGAACATTTGATCTTGCTTTGATCGGTAGAGGATTTTTA 

AGAAATCCAGGTTTGGTATGGGAGTTTGCCGATAAACTTGGTGTTAGACTCCACCAGGCCTTGCAGTTAGGTTGGGG 

TTTCTGGCCCAACAAACAACAAATTGTTGATTTGATTGAAAGAACATCTAAATTAGAAGTAAATTAG 

SEQ ID No 12 

MTVPYQVKPSDEIKGAPEVSYYTPEQPVPAGTFYPQSSDEVAPKIFQPLKIGKLALPNRIGVSPMCQYSADYNFEAT 
PYHLIHYGSLVNRGPGITIVESTAVSPEGGLSPHDLGIWKDEQAEKLKPIVDYAHSQKQLIAIQLGHGGRKASGQPL 
FLHLEQVADKSVNGFADKAVAPSALAFRPNGNLPVPNELTKDEIKRVVKDFGAAARRAVEISGFDAVEIHGAHGYLI 
NEFYSPISNKRTDEYGGSFENRTRFIiKEVIDSVKSSIPNDVPVFLRISAAENSPDPEAWTIEDSKKLADILVEKGIA 
LVDVSSGGNDYRQPPRSGXSKELREPIIIVPLSRAIKQHVGDKLLVSCVGGIiEKDPELLNKYLEEGTFDLALIGRGFL 
RNPGLVWEFADKLGVRLHQALQIiGWGFWPNKQQXVDLIERTSKLEVN 
SEQ ID No 13

ATGGA2VAACAACJ^TACTATACCGGCATTATTTCAACCCATJ^AAGATCAGTGACTCGATCACATT^ 
TGGTGTTTCACCAATGTGCATGTATTCATCGTCACCAACTGACAATCAAGCCACTCTGTTTCATTTTGTTCATTATG 
GATCATTTGCTGTACGTGGACCAGCATTAATCATTTTAGAGAGTATCTTTGTGTCCGAAAATTCCGGATTATCCATT
CATGATT.TAGGTCTTTGGAATGATGATCAAGCTCACAGTTTACGGAAAATTGTTGATTTTATTCATGATCAAGACGG 
AATTTGCTGTATACAATTGAATCACGCTGGGCGAAAGATTGTTGAAGGGGTACCATTCCAACAAATACAACATGGTT 
GGCAAGAACATTGTGTGGGGCCATCTACTGAGCCATTTAGTGATTCACACAATACACCACGAGAATTGACTGTTAAT 
GAAATAAATTCAATTGTGGAAGACTTTGCCAATGCAGCTTGGCGGGCTGTGGAAATCTCAAAATTCGATGCCATTGA 

AATACATTGTGCTAATGGATGTTTAATACACCAATTTTTAAGTAAATTGACAAACAAGAGAGCTGACCAATACGGGG
GCTCATTTGAAAACAGAGTTAGATTTCTTTTACAAATAATTGAGAATATAAAACGAAAGATAGAAACACCGATTTTC 
TTAAAGTTTCCAATGTCAGATAATTGTAGTGATCCGGAAGCGTGGTCTACGGAAGATGCATTGAAGTTGGCCGATCT 
TGTTATTGATTTAGGAGTAAAGGTGATCGACGTTACATCAGGTGGAAATGTTGCGCATTGCAAATCTAGATATCTAT 
TAAATGACGACAAACAACTACCTTCTCAAGTGCCCTTGGCTCGTAAATTGAAAAGCCACATTAGAAACCGATGTTTG 

ATCGCATGCAGTGGAGGATTAGATCGAGACATATTTAAACTCGATGAGTTTATTGCTAATGGTGACTTTGATATAGC
ATTGATAGGTA2\AGGATTTCTCAAAAACACTGGATTGATCAGCCGTATTGCTGACCAATTGCAAGCACAATTCAGAA 
CAGCACCTCAATATAAGTTGGCCTTATCATAA 

SEQ ID No 14 

MENl^TIPALFQPIKISDSITLPNRIGVSPMCMYSSSPTDNQATLFHFVHYGSFAVRGPALIILESIFVSENSGLSI 
HDLGLWNDDQAHSLRKIVDFIHDQDGICCIQLNHAGRKIVEGVPFQQIQHGWQEHCVGPSTEPFSDSHNTPRELTVN 
EINSIVEDFANAAWRAVEISKFDAIEIHCANGCLIHQFLSKLTNKRADQYGGSFENRVRFLLQIIENIKRKIETPIF
LKFPMSDNCSDPEAWSTEDALKLADLVIDLGVKVIDVTSGGNVAHCKSRYLLNDDKQLPSQVPLARKLKSHIRNRCL 
IACSGGLDRDIFKLDEFIANGDFDIALIGKGFLKNTGLISRIADQLQAQFRTAPQYKLALS

SEQ ID No 15 

ATGGCCGACTTCACCCAGAAGAAGACCTCCTCCCCCGCGGCCCCGGGTGTTCCCTTCTACACCCCGGCCCAGGTCCC

CGCCGCCGGCACTCCCCTCCCCTCCACCCCCGGCGATGTCCCTACTCTCTTCACCCCTCTCAAGATCCGTGGTGTTG
AGCTCCAGAACCGCTTCGCCGTTGCGCCCATGTGCACCTACTCTGCCGACGATGGCCACATGACCGACTGGCACCTT 
GTCCACCTGGGCTCCTTCGCCCTCCGCGGTGTCCCCCTCACCATCTTCGAGGCCACCGGCGTCCTCCCCAACGGCCG 
CATCACCCCCGAGTGCTCTGGTCTCTGGCAGGACTCCCAGATTGCGCCCCTCAAGCGCATCGTCGACTACATCCACT 
CCCAGGGCCAGAAGGCCGGTATCCAGCTTGCCCACGCCGGCCGCAAGGCCTCCACCAAGGCCCCCTGGCACTACCAG 

CGCGGCAAGAGCGAGCTTGCCGGCCCCGAGCAGGGTGGCTGGCCCGAGAACGTCTGGGCCCCCAGCGCCATCAGCTA
CAACGAGGAGACCTTCCGCTTCCCCAAGGAGATGACCGTCGAGCAGATCCACGAGCTCGTCGAGGCCTGGAAGGCGT 
CTGCCCAGCGTGCCCTCAAGGCCGGCTTCGACCTCATTGAGATCCACGCCGCCCACGGCTACCTCATTTCCGAGTTC 
TTGAGCCCCATCTCCAACCAGCGTACCGACCAGTACGGTGGCTCCTTCGAGAACCGCACCCGCGTTCTCCGCGAGAT 
CATCTCGGCCGTCCGCTCCGTCATCCCCGAGGACATGCCCCTCTTCGTCCGTGTCTCCGCCACCGAGTGGATGGAGT 

ACACCGGCCAGCCCTCGTGGGACCTCCAGCAGAGCATTGAGCTCGCCAAGATCCTCCCCGACCTCGGCGTCGACCTC
CTCGACGTCTCTTCCGGCGGCAACAACAAGGACCAGAAGATCAACGTCCACACCTACTACCAGATCGACATGGCCGA 
GCAGATCCGCGCGGCCGTGCACGAGGCCGGCAAGCAGCTCCTCGTCGGTGCCGTCGGCTTGGTCACCTCGGCTGAGA 
TCGCCAAGGAGACCGTCCAGGAGAAGGAGGATGGCAGAGTCACCATCCAGCGCGAGAACGGCGCCAAGACTCGTGCC 
GATATGGTCCTTGTTGCCAGGCAGTTCTTGAAGGAGCCCGAGTTCGTCCTCACTGTCGCCGACGAGTTGGGTGTTGA 

TGTCAAGGCCCCTGTTCAGTACCTCCGTGGTCCTCTTAGCAGCAGGCCCAAGAAGTTGACCACTGTTCCTTAA

SEQ ID No 16 

MADFTQKKTSSPAAPGVPFYTPAQVPAAGTPLPSTPGDVPTLFTPLKIRGVELQNRFAVAPMCTYSADDGHMTDWHL 
VHLGSFALRGVPLTIFEATGVLPNGRITPECSGLWQDSQIAPLKRIVDYIHSQGQKAGIQLAHAGRKASTKAPWHYQ
RGKSELAGPEQGGWPENVWAPSAISYNEETFPFPKEMTVEQIHELVEAWKASAQRALKAGFDLIEIHAAHGYLISEF 
LSPISNQRTDQYGGSFENRTRVLREIISAVRSVIPEDMPLFVRVSATEWMEYTGQPSWDLQQTIELAKILPDLGVDL 
LDVSSGGNNKDQKINVHTYYQIDMAEQIRAAVHEAGKQLLVGAVGLVTSAEIAKETVQEKEDGRVTIQRENGAKTRA
DMVLVARQFLKEPEFVLTVADELGVDVKAPVQYLRGPLSSRPKKLTTVP 


SEQ ID No 17 

atggctacttccactacctccgacctcaaactctcccaacccctcaccctccccaatggccttaccctccccaaccg 
cctcgtcaaagccgccatggccgaacaaatgggcttcggcaaccacctgcccaaccccgaactcgccgccgtctacg 

ccacctgggcccgcggcgactggggcctgattctcaccggcaacgtccaagtcgaccacgcgcacaagggcgacgcc
cacgacatcagccccaaccaccccggcaccacgcccgagcagaccgtcacggccttcaaggcctgggcggacgccgc 
gcgcctgaatggccagtccaaaacgcctgtggtcgtgcagatcaaccaccctggtcgccagagtccgatgggcgcgg 
gcacgcggggactgtgggagaaggcggtggcgccctcgccggtgccgttggtgttgggagaggcgtttgtgcctcgc 
ttgttgtcgaaagtgcttttcggcacgccgcgggagctgacggttgcggagatcaaggatatcgtgcaaaagtttgc 

ggtgacggcgaggatcacggccgaggccgggttcaatggcgtggagatccatgcggcgcatggatacctgttggcgc



agttcttgagcaagaagacaaacaggcgcggggatgagtatggcgggtcggctgagaacagggcgaggattgttggg
gagattattaaggagtgcaggaggcaggtgactgaggcggtgggtgaagaggaggcgaagaagtttgtggtgggaat
caagctgaacagtgcggattggcaggcgggacgcgatggaaaggaggaggaggagacggatacggcggaggaggtgt
tgaagcagattgagctttttgagcagtgggggatcgactttgtcgaggttagcggtggcagttatgaggatcctcag
gtaagttttggtgttgtttgagggatggggcaaggggttgtctgtcgtgaacaacaaaaggggcacggaacaaatgc 
taacgccatacagatggccaacggtcccaagcccgaaaagtccgaacgcaccatggcccgcgaggccttcttcctcg 
agttcgccaagatcatccgcaccaagttccccaagcttcctctcatggtcaccggcggcttccgcactcgtcagggc 
atggaggccgctttggaatccgatgattgcgacatgatcggtatcggacgcccggccatcatcaacccttcgcttcc 
cgccaacttgatcctcaacccggaggtgccggatgcggatgcccgcttgttcgacaagaagagggctgagccgcact 
ggatcgttgagaagttgggcatgaagtccattgttggtgctggtgttgaggtggtacgtcacgttccaaccccattt 
gcttcattgtgtttccgagtatgtcatgctgacttggttcttttctagacgtggtatgtgagcgagctcaagaagct 
ggccaagttttag 



SEQ ID No 18 

ATGGCTACTTCCACTACCTCCGACCTCAAACTCTCCCAACCCCTCACCCTCCCCAATGGCCTTACCCTCCCCAACCG 
CCTCGTCAAAGCCGCCATGGCCGAACAAATGGGCTTCGGCAACCACCTGCCCAACCCCGAACTCGCCGCCGTCTACG 
CCACCTGGGCCCGCGGCGACTGGGGCCTGATTCTCACCGGCAACGTCCAAGTCGACCACGCGCACAAGGGCGACGCC 
CACGACATCAGCCCCAACCACCCCGGCACCACGCCCGAGCAGACCGTCACGGCCTTCAAGGCCTGGGCGGACGCCGC 
GCGCCTGAATGGCCAGTCCAAAACGCCTGTGGTCGTGCAGATCAACCACCCTGGTCGCCAGAGTCCGATGGGCGCGG 
GCACGCGGGGACTGTGGGAGAAGGCGGTGGCGCCCTCGCCGGTGCCGTTGGTGTTGGGAGAGGCGTTTGTGCCTCGC 
TTGTTGTCGAAAGTGCTTTTCGGCACGCCGCGGGAGCTGACGGTTGCGGAGATCAAGGATATCGTGCAAAAGTTTGC 
GGTGACGGCGAGGATCACGGCCGAGGCCGGGTTCAATGGCGTGGAGATCCATGCGGCGCATGGATACCTGTTGGCGC 
AGTTCTTGAGCAAGAAGACAAACAGGCGCGGGGATGAGTATGGCGGGTCGGCTGAGAACAGGGCGAGGATTGTTGGG 
GAGATTATTAAGGAGTGCAGGAGGCAGGTGACTGAGGCGGTGGGTGAAGAGGAGGCGAAGAAGTTTGTGGTGGGAAT 
CAAGCTGAACAGTGCGGATTGGCAGGCGGGACGCGATGGAAAGGAGGAGGAGGAGACGGATACGGCGGAGGAGGTGT 
TGAAGCAGATTGAGCTTTTTGAGCAGTGGGGGATCGACTTTGTCGAGGTTAGCGGTGGCAGTTATGAGGATCCTCAG 
ATGGCCAACGGTCCCAAGCCCGAAAAGTCCGAACGCACCATGGCCCGCGAGGCCTTCTTCCTCGAGTTCGCCAAGAT 
CATCCGCACCAAGTTCCCCAAGCTTCCTCTCATGGTCACCGGCGGCTTCCGCACTCGTCAGGGCATGGAGGCCGCTT 
TGGAATCCGATGATTGCGACATGATCGGTATCGGACGCCCGGCCATCATCAACCCTTCGCTTCCCGCCAACTTGATC 
CTCAACCCGGAGGTGCCGGATGCGGATGCCCGCTTGTTCGACAAGAAGAGGGCTGAGCCGCACTGGATCGTTGAGAA 
GTTGGGCATGAAGTCCATTGTTGGTGCTGGTGTTGAGGTGACGTGGTATGTGAGCGAGCTCAAGAAGCTGGCCAAGT 
TTTAG 



SEQ ID No 19 

MATSTTSDLKLSQPLTLPNGLTLPNRLVKAAMAEQMGFGNHLPNPELAAVYATWARGDWGLILTGNVQVDHAHKGDA
HDISPNHPGTTPEQTVTAFKAWADAARLNGQSKTPVVVQINHPGRQSPMGAGTRGLWEKAVAPSPVPLVLGEAFVPR
LLSKVLFGTPRELTVAEIKDIVQKFAVTARITAEAGFNGVEIHAAHGYLLAQFLSKKTNRRGDEYGGSAENRARIVG
EIIKECRRQVTEAVGEEEAKKFVVGIKLNSADWQAGRDGKEEEETDTAEEVLKQIELFEQWGIDFVEVSGGSYEDPQ
MANGPKPEKSERTMAREAFFLEFAKIIRTKFPKLPLMVTGGFRTRQGMEAALESDDCDMIGIGRPAIINPSLPANLI
LNPEVPDADARLFDKKRAEPHWIVEKLGMKSIVGAGVEVTWYVSELKKLAKF



SEQ ID No 20 

atgtcggcagaaaagaagactttgagcaaaccggccgccggggtgccttactacaccccagcccaggagccgccggc 
agggacccctttgcagcagcaggacgccatcccaacgctgttcaagcctctgaagatccgtggcgtcgagctctcca 
accgctttggcgtctcgcccatgtgcacctactcagccgacgatggccacctgaccgacttccacttggtgcacctg 
ggccagttcgccctgcacggcacggccctgaccattgtcgaggccacatccgtcacgcccaacggacgcatctcgcc 
cgaggacagcggcctgtggcaagacagccagatcgctcctctgcgccgcatcgtcgactacgtgcacagccagggcc 
aaaagatcgccatccaactggctcatgccggccgcaaggccagcacaaaggccccctggcacgactccttcaccccc 
agcggcgagtataagccgagagagggcttacaggtcgtcggacccgagtatggcggctggcctgatgacgtctgggc 
cccgagcgccatcccgttctcggaggactttccgaaccccaaggagatgaccgttgaggagattgagggactcgtca 
ccagctttgtggacgctgccaagcgtgccatcgaggccggcgtcgacattattgagattcacggcgctcacggttac 
ctgatcaccgagttcctttcgccgctatcaaacgtaagtggagatactttgtgtggggctgtgcgcatactccctcg 
ggtgtgacttctattaacattttatttcctggcacgcagaaacggacagacaagtacggcggcagctttgagaaccg 
cacccgggtcctgatcgatattatcaaggccgtccgggcagtgattcccgaggagatgccactcttcgtccgaatct
ccgcgaccgaatggatggagtacgccggcgagcctagctgggacctcgagcagagcacacagcttgccaagctcctc
ccggacctgggtgtcgacctgctcgacgtcagctcgggcggaaactcggtggcccaaaagatcgagctcacgccgta
ctaccagatcgacctggcagccaagatccgcgaggccgtcggcgataggttgctcataggcgcggtcggcaacatca
acacggctgacattgcgcgcgatgtcgtggatgagcagggcgccgagaaggtggccgaggccaagcagacgcatgac 
accatcgaggtcgtgagcgaatcacatggcggcaagaccaaggcggatctggtcctcattgctcgccagttcctgcg
cgagcctgagtttgtgctgaggacggcgcataaccttggggtcaatgtgcagtggcctcaccaataccacagagcag 
tgtggcgcaagggtgcaaggatttga 


SEQ ID No 21

ATGTCGGCAGAAAAGAAGACTTTGAGCAAACCGGCCGCCGGGGTGCCTTACTACACCCCAGCCCAGGAGCCGCCGGC 
AGGGACCCCTTTGCAGCAGCAGGACGCCATCCCAACGCTGTTCAAGCCTCTGAAGATCCGTGGCGTCGAGCTCTCCA 

ACCGCTTTGGCGTCTCGCCCATGTGCACCTACTCAGCCGACGATGGCCACCTGACCGACTTCCACTTGGTGCACCTG
GGCCAGTTCGCCCTGCACGGCACGGCCCTGACCATTGTCGAGGCCACATCCGTCACGCCCAACGGACGCATCTCGCC 
CGAGGACAGCGGCCTGTGGCAAGACAGCCAGATCGCTCCTCTGCGCCGCATCGTCGACTACGTGCACAGCCAGGGCC 
AAAAGATCGCCATCCAACTGGCTCATGCCGGCCGCAAGGCCAGCACAAAGGCCCCCTGGCACGACTCCTTCACCCCC 
AGCGGCGAGTATAAGCCGAGAGAGGGCTTACAGGTCGTCGGACCCGAGTATGGCGGCTGGCCTGATGACGTCTGGGC 

CCCGAGCGCCATCCCGTTCTCGGAGGACTTTCCGAACCCCAAGGAGATGACCGTTGAGGAGATTGAGGGACTCGTCA
CCAGCTTTGTGGACGCTGCCAAGCGTGCCATCGAGGCCGGCGTCGACATTATTGAGATTCACGGCGCTCACGGTTAC 
CTGATCACCGAGTTCCTTTCGCCGCTATCAAACAAACGGACAGACAAGTACGGCGGCAGCTTTGAGAACCGCACCCG 
GGTCCTGATCGATATTATCAAGGCCGTCCGGGCAGTGATTCCCGAGGAGATGCCACTCTTCGTCCGAATCTCCGCGA 
CCGAATGGATGGAGTACGCCGGCGAGCCTAGCTGGGACCTCGAGCAGAGCACACAGCTTGCCAAGCTCCTCCCGGAC 

CTGGGTGTCGACCTGCTCGACGTCAGCTCGGGCGGAAACTCGGTGGCCCAAAAGATCGAGCTCACGCCGTACTACCA
GATCGACCTGGCAGCCAAGATCCGCGAGGCCGTCGGCGATAGGTTGCTCATAGGCGCGGTCGGCAACATCAACACGG 
CTGACATTGCGCGCGATGTCGTGGATGAGCAGGGCGCCGAGAAGGTGGCCGAGGCCAAGCAGACGCATGACACCATC 
GAGGTCGTGAGCGAATCACATGGCGGCAAGACCAAGGCGGATCTGGTCCTCATTGCTCGCCAGTTCCTGCGCGAGCC 
TGAGTTTGTGCTGAGGACGGCGCATAACCTTGGGGTCAATGTGCAGTGGCCTCACCAATACCACAGAGCAGTGTGGC 

GCAAGGGTGCAAGGATTTGA

SEQ ID No 22 

MSAEKKTLSKPAAGVPYYTPAQEPPAGTPLQQQDAIPTLFKPLKIRGVELSNRFGVSPMCTYSADDGHLTDFHLVHL 
GQFALHGTALTIVEATSVTPNGRISPEDSGLWQDSQIAPLRRIVDYVHSQGQKIAIQLAHAGRKASTKAPWHDSFTP
SGEYKPREGLQVVGPEYGGWPDDVWAPSAIPFSEDFPNPKEMTVEEIEGLVTSFVDAAKRAIEAGVDIIEIHGAHGY
LITEFLSPLSNKRTDKYGGSFENRTRVLIDIIKAVRAVIPEEMPLFVRISATEWMEYAGEPSWDLEQSTQLAKLLPD
LGVDLLDVSSGGNSVAQKIELTPYYQIDLAAKIREAVGDRLLIGAVGNINTADIARDVVDEQGAEKVAEAKQTHDTI
EVVSESHGGKTKADLVLIARQFLREPEFVLRTAHNLGVNVQWPHQYHRAVWRKGARI


SEQ ID No 23 

ATGACTATTGTTAATGAAGGAGCCGAAAATGTTGGTTATTTTACACCTGCGCAAAAAATACCAGCTGGAGCGGCGAT 
AGGTGTACCGCAAACAAAATTATTTACTCCTCTTAAAATTAGAGGAGTGGAGTTCCATAACAGAATGTTTGTTTCGC
CGATGTGCACTTATTCCGCTGACCAAGAAGGGCATTTGACAGATTTTCACCTAGTACATCTTGGAGCGATGGGAATG 
CGTGGGCCTGGCCTTGTAATGGTAGAAGCGACAGCGGTTTCCCCAGAGGGACGAATTTCACCTAATGATTCAGGATT 
ATGGATGGAGTCGCAAATGAAGCCGTTACGAAGAATTGTTGAATTTGCTCATTCGCAAAATCAAAAAATTGGGATTC 
AATTGGCGCATGCTGGTAGAAAGGCTAGCACCACTGCTCCTTATCGAGGATACACAGTTGCGACTGAAGCTCAAGGT 
GGGTGGGAGAATGATGTTTATGGACCAAATGAAGACAGGTGGGACGAAAACCACGCTCAACCTCATAAGTTAACTGA
AAAGCAATATGATGAATTAGTGGATAAGTTTGTTGTTGCTGCGAAGCGTGCAGTTGAAATAGGTTTTGATGTAATTG 
AAATTCATGGCGCTCATGGTTATCTTATATCGTCAACAGTTAGTCCTGCCACTAATGACCGCAATGACAAGTATGGT 
GGGACATTTGAGAAACGTATTTTGTTTCCTATGGAAGTTGTCCATTCTGTTCGTAAAGCSATTCCAGATAGTATGCC 
CTTGTTTTATAGAGTAACGGCTACAGATTGGTTGCCCAAAGGACAAGGATGGGAGATAGAAGATACAGTTGCATTAG 
CAGCGAGGCTTCGCGATGGTGGTGTTGACTTGATAGATGTTAGCTCTGGTGGTAATCACAAGGATCAAAGAATTGAG
GTGAAGGATTGCTATCAAGTTCCTTTTGCGGAAAAGATTAAGGATCAAGTGAATGGAATACTACTTGGCGCTGTCGG 
AATGATCAGGGATGGTCTTACGGCGAATGAAATCCTAGAAAGTGGAAAAGCTGATGTTACTTTTGTCGCAAGGGAGT 
TCTTAAGGAACCCGTCGTTGGTGCTAGACAGCGCGAACCAGTTGGGTGAAAATGTTGCATGGCCAGTTCAGTATGAC 
TATGCAGTTAAGGGACACAGAAAGTTACGTTGA 


SEQ ID No 24 

MTIVNEGAENVGYFTPAQKIPAGAAIGVPQTKLFTPLKIRGVEFHFTNRMFVSPMCTYSADQEGHLTDFHLVHLGAM 
GMRGPGLVMVEATAVSPEGRISPNDSGLWFTMESQMKPLRRIVEFAHSQNQKIGIQLAHAGRKASTTAPYRGYTVAT
EAQGGWENDVYGPFTNEDRWDENHAQPHKLTEKQYDELVDKFVVAAKRAVEIGFDVIEIHGAHGYLISSTVSPAFTT
NDRNDKYGGTFEKRILFPMEVVHSVRKAIPDSMPLFYRVTATDWLPKGQGWEIEDTVAFTLAARLRDGGVDLIDVSS
GGNHKDQRIEVKDCYQVPFAEKIKDQVNGILLGAVGMIRDGLFTTANEILESGKADVTFVAREFLRNPSLVLDSANQ 
LGENVAWPVQYDYAVKGHRKLR 

SEQ ID No 25

CG^^GCTGGACCCAAACAAACAGCTGACCCTCTCCTTGACAACAAAGCCGGCCATCCTCGCCGACGATTGCCTCTA
CCCCCGCATAGTGAGAGTGGGAGGTGGGTTCTGGGAGGGTGAAACAGAGAGGATGAGGGGGAGGGGGAAGAAGGGCG 
CGGGGGGTGTGGCGTTTTAGACCGGGGGCGAGGAGGCTGGGGCGGGAAGGGCAGTCGACGGGAGGAGGGGTGCGAGG
CTGTTGAAGGCCCTCCGCATGGGCGACCTGACGATGAACA?\CGGGATGTGGGTGAGGGGGATGTGGGAGTAGTGGGG
GGAGAATGGGGAGGGGAGGGAGTAGGAGCTGGTGGACGTGGGGGAGTTCGGGGTGCAGGGGGGGGCGCTGTGGATGG 
TGGAGGGCAGGGGGGTGGAGGGTGGTGGCGGGATGTGGCGGGAGGATGTGGGTTTGTGGGAGGAGTGGGAGATTGGG 
GGGGTGAAGGGGATGGTGGAGTTTATGGAGTCGGAGAACGAGGTGGGGGGGATGGAGGTGGGGGAGGGGGGTGGGAA 
GGGTAGGAGGCTGGCAGGGTGGATCAGGGAGGGTCGGGGCAAGGGGGTGGGTCAGGAGAGGGAGAAGGGCTGGCGGG 
AGGAGGTTGTGGGTGGGAGCGGGATTCCTTAGACGAAGGACTGGGCCAGAGGGCGTGAGTTGACTACCGAGGRRGTC
GAGGGTGTGGGTGAAGAAGTTGGGGGAGTGGGGCAAGAGGTCAAATGGAGGTGGTTTTGAGGTCATTGAGATGCACG 
CCGCTGA 



SEQ ID No 26

ATGAGGGGCAGGGGGAAGAAGGGGGGGGGGGGTGTGGGGTTTTACAGGGGGGGGGAGGAGGGTGGGGGGGGAACGGC 
AGTGGAGGGCAGGAGGGGTCCGAGGGTGTTGAAGGGGCTGGGCATGGGGGAGGTGAGGATGAAGAAGGGGATGTGGG 
TGAGGGGCATGTGGGAGTAGTGCGGGGACAATGGGGAGGGGAGGGAGTAGGAGGTGGTCGAGGTGGGGCAGTTCGGG 

CTGGAGGGGGCGGGGGTGTGGATGGTGGAGGGCAGGGGCGTGGAGGCTGGTGGGGGGATGTGGGGGGAGGATGTGGG
TTTGTGGCAGGAGTGGGAGATTGGGGGGCTGAAGGGCATGGTCGAGTTTATGGAGTGGGAGAAGCAGGTGGCGGCCA 
TGGAGGTCGCGGAGGGGGGTGGGAAGGGTAGCAGCGTGGGACGGTGGATCACGGAGGGTGGCGGGAAGGCGCTGGGT 
CAGGAGAGGGAGAACGGGTGGGGGGACGACGTTGTGGGTGGGAGCGGGATTCCTTACACCAAGGACTGGGCCACACC 
GGGTGAGTTGACTAGGGAGGRGTGGAGGGTGTGGGTGAAGAAGTTCGCGGAGTGGGCCAAGAGGTCAAATCGAGCTG 

GTTTTGAGGTCATTGAGATCCACGCGGCT

SEQ ID No 27 

MTGTANKAJ^PGVPFYTPAQEPPAGTPVDASTAPTLFKPLRIRDLTINNRIWVSPMCQYSADNGHATDYHLVHLGQFA 
LHGAALSMVEATAVEARGRISPEDVGLWQDSQIAPLKRIVDFIHSQNQVAAIQLAHAGRKASTLAPWITEARGKALA
QESENGWPDDWAPSAIPYTKDWATPRELTTEXSRVWVKKFAESAKRSNRAGFDVIEIHAA 

SEQ ID No 28 

GAAGTGGTGTAGATGTGGTTGAATTGGTATATTAGAGGGGAGTAGTGTATATGGGAGAGAGTATACATTGAAGTTGC
GAAGGTTGTTGGAGATTGATTAATGATGGGTTAGGAGATAATGGACAACGTTGCGGCTGAAGGGGTTCCATATTACA 
CAGCGGGTGAAGAGCGGGGAGGTGGTAGGGAGAGAAGGGGCTCAACGAAGCTATTCACAGCCATCACCATCCGCGGC 
GTCACATTCCCAAACCGCCTCTTCGTTGCCCCTCTCTGCCAATACTCCGCCAAAGATGGTTATGCCACTGATTGGCA 
CTTGACTCAGCTGGGGGGAATAATCCAAAGAGGCGGGGGATTGTCGATGGTGGAGGCTACCGCTGTAGAAAACCACG 

GTGGCATCACACCTCAGGATGTTGGTCTGTGGGAAGAGGGCCAGATGGAGGGTGTGAAGGGCATGAGGACTTTGGGG
CACAGTGAGAGCCAGAAAATTGGTATCCAGCTGTGGGATGGGGGTGGCAAGGGGAGTTGGGTATGTGGGTGGGTAAG
GGTAAATGGTGTGGGGGGGGAAGAAGTGGGTGGGTGGCGAGAGAATATCGTTGGTGGGTGGGGGATCGGAGAAGAAA
ATGGTGTGAAGGCAGTTGGGAAGGGTTTGAGGAAGGAGGATATAGAGGAACTCAAGAGGGACTACGTGGAAGCGGCA 
AAACGAGGGATGGATGGTGGTTTGGATGTTATGGAAATTGATGCAGGTGATGGATATGTAGTGGATGAATTCTTGAG 

TCCGGTAAGdAATCAAAGAACCGACGAGTATGG

SEQ ID No 29 

ATGGCTTACGAGATAATGGAGAAGGTTGCGGCTGAAGGGGTTCCATATTACACAGGGGCTCAAGACCCGGCAGGTGG 
TAGGGAGAGAAGGGGGTGAACGAAGGTATTCACACCCATCAGCATGCGCGGGGTCACATTCCCAAACCGCGTGTTCC
TTGCCCGTCTCTGCGAATACTCCGCCAAAGATGGTTATGGGACTGATTGGCACTTGACTGAGCTCGGGGGAATAATC 
GAAAGAGGGCCCGGATTGTGCATGGTGGAGGCTACCGCTGTAGAAAACCACGGTCGCATGACACCTCAGGATGTTGG 
TCTGTGGGAAGACGGGCAGATGGAGGGTCTGAAGGGCATGAGGAGTTTCGCGCACAGTGAGAGGGAGAAAATTGGTA 
TCCAGCTGTCGCATGGGGGTGGGAAGGCGAGTTGGGTATGTGGGTGGGTAAGCGTAAATGGTGTGGGGGCGGAAGAA 
GTGGGTGGGTGGGGAGAGAATATCGTTGGTGGGTGGGGGATGGCAGAAGAAAATGGTGTGAAGGGAGTTGGCAAGGG
TTTGAGGAAGGAGGATATAGAGGAAGTGAAGAGGGACTACGTGGAAGGGGGAAAAGGAGGGATGGATGGTGGTTTGG 
ATGTTATGGAAATTGATGGAGGTGATGGATATGTAGTGGATGAATTGTTGAGTGGGGTAAGGAATGAAAGAAGGGAG 
GAGTATGG 

SEQ ID No 30

MAYEIIDNVAAEGVPYYTPAQDPPAGTQTSGSTKLFTPITIRGVTFPNRLFLAPLCQYSAKDGYATDWHLTHLGGII 
QRGPGLSMVEATAVQNHGRITPQDVGLWEDGQIEPLKRITTFAHSQSQKIGIQLSHAGRKASCVSPWLSVNAVAAEE 
VGGWPDNIVAPSAIAQENGVNPVPKAFTKEDIEQLKSDYVEAAKRAIHAGFDVIEIHAAHGYLLHQFLSPVSNQRTD 
EY

SEQ ID No 31

TTTGGATGGTAT2^TAATAATTCTATTTGTGAAACATACGGGGCTGGTCTTGATCAAGAACGGTCCATCTATGGTCT 
ATAAAGAACTCTTGTTCACTTTCTTTCCACGTCCCTTGAAGCTCCAATCAATCTGGTTCGCCATCTTGACCTCCACG
CCAAGCTCGTTAGCAAAAGCTCGAACCAGACCAGGATTCTGTTGGAACCAACGTCCAGCCCTCACAATGTCGATACC 
AGATTGCAAAACCTCTTCAGCAAGATGTCCAGTCTTGATTCCACCTACTGCTGAAACAAGTACACTATCGCCAACAG 
CCTTCTTTACCTGTTTGGCGAGGTCTACCTGGTAAGCAGGACCGGACTTGATGGCGATGGCGGACTTAGGATGGATA 
CCGCCTGAGCTGACGTCCACCAAGTCTACTCCATGCTTGGGCAAGATACGCGCGAGTTGACAAGTCTGCTCGACTGT 
CCAGCTTTCAGGAAACTCGTCTTTGAATTGAGAGTCAAACTCGAACCAATCAGTTGCACTGACACGAACGAGGACAG
GTGTAGTTTCGGGGATGGCAGCGCGGATGAGGTCAAGGATTTCCAAGACAACTCTGATACGGTTCTCGAAACTGCCA 
CCATACTCGTCGGTT 

SEQ ID No 32 


AACCGACGAGTATGGTGGCAGTTTCGAGAACCGTATCAGAGTTGTCTTGGAAATCCTTGACCTCATCCGCGCTGCCA 
TCCCCGAAACTACACCTGTCCTCGTTCGTGTCAGTGCAACTGATTGGTTCGAGTTTGACTCTCAATTCAAAGACGAG 
TTTCCTGAAAGCTGGACAGTCGAGCAGACTTGTCAACTCGCGCGTATCTTGCCCAAGCATGGAGTAGACTTGGTGGA 
CGTCAGCTCAGGCGGTATCCATCCTAAGTCCGCCATCGCCATCAAGTCCGGTCCTGCTTACCAGGTAGACCTCGCCA 
AACAGGTAAAGAAGGCTGTTGGCGATAGTGTACTTGTTTCAGCAGTAGGTGGAATCAAGACTGGACATCTTGCTGAA
GAGGTTTTGCAATCTGGTATCGACATTGTGAGGGCTGGACGTTGGTTCCAACAGAATCCTGGTCTGGTTCGAGCTTT
TGCTAACGAGCTTGGCGTGGAGGTCAAGATGGCGAACCAGATTGATTGGAGCTTCAAGGGACGTGGAAAGAAAGTGA 
ACAAGAGTTCTTTATAG 

SEQ ID No 33

TDEYGGSFENRIRVVLEILDLIRAAIPETTPVLVRVSATDWFEFDSQFKDEFPESWTVEQTCQLARILPKHGVDLVD
VSSGGIHPKSAIAIKSGPAYQVDLAKQVKKAVGDSVLVSAVGGIKTGHLAEEVLQSGIDIVRAGRWFQQNPGLVRAF 
ANELGVEVKMANQIDWSFKGRGKKVNKSSL


SEQ ID No 34 

AGGAAGTTGCATGTCACTTGTAGTGACAGGGCGTCGTGTAAATTTTATAAATACCTATACTTGTTTGTTCACTTCTA 
TGCTACTCATATCAATCCGAGAAGATCAAACAGTCCCCTATACACACTTGTCAAGACCTATCTATTATTTCAAAAAT 
CAGCAATATGGCTGAGACAATGCCTAAGTGTGAGGCAAATGGCCATCACAAAATCATCATCAATAAGGAAGCTCCGA 

ATGTTCCTTTCTATACTCCAGTGCAAGATCCACCAGCAGGAACGTCTTACGATGTTCAGCCTGAAGGAAGCCTATTC
TCTCTTATTAAAATAAGAAACCTGACTCTTCAAAACCGGATTTTTGTCTCCCCAATGTGTCAATATTCAGCAAAGGA 
TGGTGTCATGACCCCCTGGCACAAACAACACCTGGGCAGCTTCGCAGCACGAGGTCCGGGTCTCATTGTCACAGAAG 
TCAACGCAGTTTCACCAGAGGGACGAATCAGTCCTGAGGATGCAGGCATCTACGATGATGGGCAGCTTGGACCTCTC 
CGGGATATTGTGGACTTTGTACACAGCCAGGGCGCCAAGATTGCTATTCAGATAGGTCATGCTGGGAGAAAAGCGAG 

CACAGTCGTACCGTGGCTGGACCGCAAGAACACTGCTTTTA



SEQ ID No 35 

MPKCEANGHHKIIINKEAPNVPFYTPVQDPPAGTSYDVQPEGSLFSLIKIRNLTLQNRIFVSPMCQYSAKDGVMTPW
HKQHLGSFAARGPGLIVTEVNAVSPEGRISPEDAGIYDDGQLGPLRDIVDFVHSQGAKIAIQIGHAGRKASTVVPWL 

DRKNTAF 

SEQ ID No 36 


GCACGAGGGATTATTGACAACATCGCGGCTGAAGGGGCTCCCTACTACACGCCTGCTCAAGACYCTCCAGCAGGCAC 
ACAGACCAGCGGCTCAACCAAGGTTTTCACACBCATCACCATCCGAGGCGTCACATTCCCAAACCGTCTCTTTCTTG 
CCCCTCTCTGTCAATACTCCGCCAAAGATGGATATGCTACTGATTGGCACTTGACTCATCTCGGAGGCATTATCCAA 
CGAGGCCCGGGACTGTCCATGGTAGAGGCCACCGCTGTTCAAAACCACGGTCGCATCACGCCTCAGGACGTTGGTCT 

CTGGGAAGATGGACAAATCGAGCCCTTTGAAGCGCATCACTACTTTTGCCCACAGCCAAAGCWCAGA?^GATTGGTAT
TCAGCTCTCGCACGCTGGTCGTAAGGCTAGTTGTGTATCTCCGTGGTTGAGCATCAACGCTGTTGCCGCTAAGGAAG 
TCGGTGGCTGGCCAGACAACATTGTTGCTCCTTCTGCCATCGCACAAGAAGCTGGCGTGAACCCTGTTCCCAAGGCC 
TTCACCAAGGAGGATATCGAGGAACTCAAGAATGACTTTCTGGCTGCAGCMAAACGAGCCAWCCGCGCTGGTTTTGA
TGTCATCGAGATCCATGCAGCTCATGGATACKTGCTTCACCAGTTCTTGAGTCCAGTCAGTAACCAAAGAACCGATG 

AGTATGGTGGCAGCTTCGAGAACCGTATCAGAGTCGTCTTGGAGATCATTG

SEQ ID No 37 

GCACGAGGGATTATTGACAACATCGCGGCTGAAGGGGCTCCCTACTACACGCCTGCTCAAGACYCTCCAGCAGGCAC 
ACAGACCAGCGGCTCAACCAAGGTTTTCACACBCATCACCATGCGAGGCGTCACATTCCCAAACCGTCTCTTTCTTG 
CCCCTCTCTGTCAATACTCCGCCAAAGATGGATATGCTACTGATTGGCACTTGACTCATCTCGGAGGCATTATCCAA
^^^^S^^^^^'^^^^^^g^'^cccttgaagcgcatcactacttttgcccacagccaaagccagKa^^

AGCTGTCGGACGCTGGTCGTAAGGCTAGTTGTGTATCTCCGTGGTTGAGCATCAACGCTCTTGCTO?T^^ 

ggtggctggccagacaacattgttgctccttctgccatcgcacaagaagctggcgtgSccctSSS^g^ 
caccaaggaggatatcgaggaactcaagaatgactttctggctgcagcmaaacgagcc^cSSctS^^ 
tcatcgagatccatgcagctcatggatacktgcttcaccagttcttgagtccagtcagScSSS^ccgI^^^ 
tatggtggcagcttcgagaaccgtatcagagtcgtcttggagatcattg

SEQ ID No 38

argiidniaaegapyytpaqdxpagtqtsgstkvftxitirgvtfpnrlflaplcqysakdgyatdwhlthlggiio 

^^Mo^:f.r'^''''''^'''''''''''^^«2DVGLWEDGQIEPlKRITTFAHSQSQKIGIQLSHAGRKA^^ 
SEQ ID No 39 

TGGCACCATGGTTAAGCGGCGGCGATGTTGCTGGTGAGGACGTCAACGGATGGCCACAGGATGTCTGGGO^^^ 

gcgattccatggaacgagaagcacgctgtcccaaaggagatgtcgttgStStmcS?S^ 
tggagaggcggtcaagcgggcattgaaggctggatttgatgttattgagattcacaa?gctScg^^ 

J?G=?S?S?c^J?S^^^^ 
SEQ ID No 40

f^^^^^oo^;^^^^^^°^^°^^^^°'^^^'^^™^™^<^^^2^GPGLTCVEATAVTPQGRITPEDVGIWQDSQ 

AKVVEFAHSQNQKIMIQIAHAGRKASTVAPWLSGGDVAGEDVNGWPQDVWAPSAIPWNEKHAVPKEMSLDD^ 

AFGEAVKRALKAGFDVIEIHNAHGYLLHEFICLRATPGPTSTGGSWENRTRLT^S^fSjH 

SEQ ID No 41 

gaaaaccgtacccgtctactcattgaaatcgtaacagccgtccgagccgcgatgccctcSgS?Sc?c^ 

CCQCCTCTCCTCTACAGi^TGGATGGAAGATACCGACATCGGCAAGAAGTTCGGAAGCTGGSTO^ 
SEQ ID No 42 

J!^5?^^^^f^^!f;!f,^YY^fSGGEDFTWDERSSS 

S^SJgJL'JSgSSSJ^ 

SEQ ID No. 43

cgagcggttcgacatgttttccaagctcgccgccgccgccaaggagcacggcagcctScgtcgScaggtc^^^ 

accccggtcgccaggcccgcggcagcgtccagcagcaccccattagcgccagcSStSSgS?S^gS^^^ 

tttgggtcaaagtttggcgtgcccaggcccgctaccaaggaggatattaaggcggtgattgaSg^^ 

ggccgagtaccttgaaaaggccggtttcgacggtatcgaattgcacgccgcccacggttaS?gctgg?c?S?S? 

tgtccgaaacaaccaaccagcgcaccgacgagtacggcggcagcctcgaaaaccgcatgHggc^Stc^^ 

acggccgaggtccgcaggcggacgagcaagaatttcatcctcggcatcaaaattSgStc^^ 

gggtttcaagccagaggaggcggtgcagttgtgcgaggccctcgaggcSg^SJJSSJ^^^ 

gcggcacctatgagagttttggttttgcgcaccgcaaggagtccagccgcaagcgggagSctS^ 

^^^.^^.^'''''''''''''^^^^^^^^'^^^^'^'^^'^'^tctacaccaccggcggcttgaaSSct^ 

S?^^^^^^^°c^'==s=<=tcgatgggataggcatcgggcgcgcagccggttcggagccggacctScSSS 

TCGCGGGCAAGGTGTCCAGCATTATCAAATACGCCATGGGGGAGGACGAGTTTGTGCTGciSTGJc^^^ 
GCGCAAATAAGGCTGATGGCCAAGGGCGAGGAGCCGTTTGACATCTCAAACGCCGACGAGGTGGCGCGGGTGACGCA
GTTGATGGCGGAGGGCAAGGTG 



SEQ ID No. 44

MSPPRFEAAPADPSPLGTPLKYPVSGRSAPNRFLNAAMSEGLATFDEADPSKRGIPTEQLVQLYKRWGQGEWGQIQT 
GNVMIDPEHLEAPGNMWPRDAEPSGERFDMFSKLAAAAKEHGSLIVAQVGHPGRQARGSVQQHPISASDVQLKQEM 
FGSKFGVPRPATKEDIKAVIEGFAHTAEYLEKAGFDGIELHAAHGYLLAQFLSETTNQRTDEYGGSLENRMRLILEV 
TAEVRRRTSKNFILGIKINSVEFQEKGFKPEEAVQLCEALEAAGMDFVETSGGTYESFGFAHRKESSRKRENYFIEF 
AEVIRKAVKHMVVYTTGGFKTVGAMVDALQGVDGIGIGRAAGSEPDIAKDIIAGP^
AQIRLMAKGEEPFDISNADEVARVTQLMAEGKV

SEQ ID No. 45 

AGCTTAGACCTACAGAGAGCATTGCTACTGTAAGTTGTATTTCGCCTTCTCGCATAGAACAAAATATAACTGATGGT 
GTAGGTATAAAACTAGCATCCTCTTCCACCTTTCAGATCCCCCTGACAAGCACCTTATGGCTTTCGATGGAAACAGC
TATTCCTTCTACTGGTAAAAATAGGATACCAGAGGCTACAATCAATACACCCTCGATAGAGGCTGTCGAATGTGGCC 
AACTGGCAACGCTGCGGTTAGTCATCGTCGGAGACTTTCTGGGAa?TCATTTTCTTCCGAGTCTCCGCCTGCTTATTA 
AGGCATCAATCTGGATGCTCCACTGTGGTACATCCAATTTTCGATTTTTCTTCGGCAGAGGCAAGGATTCCACTGGT 
TCAGTCTAGGCATTTAGAAGATCAAAGCTGTCCTGTACCTCCGTACCTGGGTGTTCGACGTCATTGCCACGTTTCGA 
CCCAAGGGCAGACGCCATGTCGCCGAGCGATCGCCGCGATATGCCTCGAATTTGCGCCATTCGGCATCCAGTTTCCA
GTGCCCTTCCCCGAATGACTGTCTCCACTATTCGGCAAGATTGTAAATCAAGCCTGAAGAAGCGGAGCATTCTTGGA 
AGTCGTATGTTCTACTGATTCTGTGCCTGGCGCAGACGGGTATATAATAAAGATCACGCACCGAGGAGTTCTTA 

SEQ ID No. 46 
GTTCGACGTCATTGCCACG

SEQ ID No. 47 
CCTTGATCGTTGCTGAGCG 

SEQ ID No. 48

ATGACTGTCGCCGATATCG 

SEQ ID No. 49 
CTATACATCGAAAATAGACTGC 


SEQ ID No. 50 

CCGTCCTGGGCGGAGTATTGGCAGAG 

SEQ ID No. 51 
GCGAATCAGATTCTAGAGGAGCAGGATATCG

SEQ ID No. 52

GCTCAGCACCTCGGCGTGGAAATCTCC 

SEQ ID No. 53

TCTGCCAATACTCCGCC 

SEQ ID No. 54 
CTTTCCGGCCGGCATG 


SEQ ID No. 55 

GGTATTGAGGGTCGCATGACTGTCGCCGATATCGA 

SEQ ID No. 56 
AGAGGAGAGTTAGAGCCTACATCGAAAATAGACTGCTTGTACACC

SEQ ID No. 57

GGTATTGAGGGTCGCATGTCGCAACCTGTTGTG 

SEQ ID No. 58

AGAGGAGAGTTAGAGCCTATATCTTCTCGAGTTTCTTCC 

SEQ ID No. 59 

GGTATTGAGGGTCGCATGGGTTCCAACGCCTTC 

SEQ ID No. 60

AGAGGAGAGTTAGAGCCTAAATGGCCCTGCCAAACTG

SEQ ID No. 61

GGTATTGAGGGTCGCATGGCTCTCCCTGACGTCGAAA 

SEQ ID No. 62

AGAGGAGAGTTAGAGCCTACTCAAAGATGCTCTCC 

SEQ ID No. 63

GGTATTGAGGGTCGCATGACAGTTCCATACCAAG 
SEQ ID No. 64 

AGAGGAGAGTTAGAGCCTAATTTACTTCTAATTTAGATGTTC 
SEQ ID No. 65 

GGTATTGAGGGTCGCATGTCGGCAGAAAAGAAG 
SEQ ID No. 66 

AGAGGAGAGTTAGAGCCCAAATCCTTGCACCCTTGCGCC 

SEQ ID No. 67 
CAGACCAATGGCCAGAAGA 

SEQ ID No. 68 
AGATGGGCGATGTGGTAGTC 

SEQ ID No. 69 
gccgcttacagggaatgata 

SEQ ID No. 70
atggctcaatctgcgagtct 

SEQ ID No. 71 
CGACTCTTGTGGGTGCTGTA 

SEQ ID No. 72 
GTGGAAAACACCCATTCTGG 

SEQ ID No. 73 
CCCCAATCGTCAGATGAAGT 

SEQ ID No. 74
CTGGCCCACGATTCACTAAT 

SEQ ID No. 75 
caaaagatcgccatccaact 

SEQ ID No. 76 
ctggtgacgagtccctcaat

SEQ ID No. 77 
ccagcagatgttcgaccccaag 

SEQ ID No. 78
cagtgaactccatctcgtccatac 

SEQ ID No. 79
TCCGTGGCGTCACCTTCC 

SEQ ID No. 80 
CAGATGGGCGATGTGGTAGTC 

SEQ ID No 81

tcgcgcgttt cggtgatgac ggtgaaaacc tctgacacat gcagctcccg gagacggt 
cagcttgtct gtaagcggat gccgggagca gacaagcccg tcagggcgcg tcagcgggtg 120

ttggcgggtg tcggggctgg cttaactatg cggcatcaga gcagattgta ctgagagtgc 180

accatatgcg gtgtgaaata ccgcacagat gcgtaaggag aaaataccgc atcaggcgcc 240

attcgccatt caggctgcgc aactgttggg aagggcgatc ggtgcgggcc tcttcgctat 300

tacgacagct gtctcttata cacatctcaa ccatcatcga tgaattttct cgggtgttct 360

cgcatattgg ctcgaattcg agctcggtac ccggggatcc tctagaagtc ctgaatagta 420

gtttgtggat taacattgtt ccgatgtagg aatcatgatc ccaaccagaa gagctggaca 480

gcccctcttc cagagcattt ttggtgggat gttttggctt agtgcgatgc aactggacaa 540

agtccttccg tttctactgc gtcttacatc atctggtatc tacgcaagcc gcccacttac 600

catatgaata agaggcactc aggttttccc tcaccccccc gaagcgatgg taagcgggtg 660

ccaaatgcat cgggagtttc tctatcataa taacctaggt attccgtaat ctattaccag 720

tctttccgaa gagctggtag caactgcacg agatttgtag gagcgagtac ccggctggac 780

gagcacgcag cacggctatt ggtcagcatg gtagctaccg aggggaggca ggccgcccaa 840

atatcgtgag tctcctgctt tgcccggtgt atgaaaccgg aaaagctgct atagagcttc 900

tgggcggcgc atgtcgggaa accagcagca agctgaccca gaaagacccg tcctcaagcc 960

attaccgtac taatcaatta tttgtgtagc aacactggga agctgtagtg cataggctgg 1020

agcagctatt tggcctttag ccccgtctgt ccgcccggtg tgcggtttcg actggcgcgc 1080

aagctcaagg tgatcaggtc gttgcgtcag tcggagacaa caagccattg ccttttctac 1140

tgcccctccc ccgctggtgg cctttttctc tcatcttctc ctctcttccc atcatcagca 1200

tcattaatct actgtctctc tttctttcta tcattctata aagtaagaac atatccatct 1260

tccctcaatc ccgtctacaa tagtgtcctc ttcactactc tgtctctatc tctcaaagct 1320

tgactgacat ttaccccgct cagtaccaga cgaatctaca cagaattcga gctcactaaa 1380

ccatggccaa gttgaccagt gccgttccgg tgctcaccgc gcgcgacgtc gccggagcgg 1440

tcgagttctg gaccgaccgg ctcgggttct cccgggactt cgtggaggac gacttcgccg 1500

gtgtggtccg ggacgacgtg accctgttca tcagcgcggt ccaggaccag gtggtgccgg 1560

acaacaccct ggcctgggtg tgggtgcgcg gcctggacga gctgtacgcc gagtggtcgg 1620

aggtcgtgtc cacgaacttc cgggacgcct ccgggccggc catgaccgag atcggcgagc 1680

agccgtgggg gcgggagttc gccctgcgcg acccggccgg caactgcgtg cacttcgtgg 1740

ccgaggagca ggactgagaa ttccactagt gcagaaagct gttttccttg ctctgtggta 1800

taagtctagt gccactattc tatgatgagt tgatgactct ttcatgactg gaaggcttac 1860

attctccaag atcatgtctc actcaaaact tatctcgggt tcactttcgg gttccatata 1920

tctcatcatt tctgggttta gaaacatctc tctcgttttt gcagctcttc tacgtactcc 1980

tagcggtttc actgaaatga atacatttgg gtaacctaat tgccaattca tatcttcctg 2040

agggcagtaa cacatcacgt acattctatc agctgtgata gagttacaaa actagcaata 2100

cttttatgct tcctcctttc ttaccattta cacatccgct ttctctctgc tcttgatctt 2160

ggcccctgat tgtattgtca cctcaccaaa ttcaagtcat cacctcttct ctagagtcga 2220

cttttatgga cagcaagcga accggaattg ccagctgggg cgccctctgg taaggttggg 2280

aagccctgca aagtaaactg gatggctttc tcgccgccaa ggatctgatg gcgcagggga 2340

tcaagctctg atcaagagac aggatgagga tcgtttcgca tgattgaaca agatggattg 2400

cacgcaggtt ctccggccgc ttgggtggag aggctattcg gctatgactg ggcacaacag 2460

acaatcggct gctctgatgc cgccgtgttc cggctgtcag cgcaggggcg cccggttctt 2520

tttgtcaaga ccgacctgtc cggtgccctg aatgaactgc aagacgaggc agcgcggcta 2580

tcgtggctgg ccacgacggg cgttccttgc gcagctgtgc tcgacgttgt cactgaagcg 2640

ggaagggact ggctgctatt gggcgaagtg ccggggcagg atctcctgtc atctcacctt 2700

gctcctgccg agaaagtatc catcatggct gatgcaatgc ggcggctgca tacgcttgat 2760

ccggctacct gcccattcga ccaccaagcg aaacatcgca tcgagcgagc acgtactcgg 2820

atggaagccg gtcttgtcga tcaggatgat ctggacgaag agcatcaggg gctcgcgcca 2880

gccgaactgt tcgccaggct caaggcgagc atgcccgacg gcgaggatct cgtcgtgacc 2940

catggcgatg cctgcttgcc gaatatcatg gtggaaaatg gccgcttttc tggattcatc 3000

gactgtggcc ggctgggtgt ggcggaccgc tatcaggaca tagcgttggc tacccgtgat 3060

attgctgaag agcttggcgg cgaatgggct gaccgcttcc tcgtgcttta cggtatcgcc 3120

gctcccgatt cgcagcgcat cgccttctat cgccttcttg acgagttctt ctgaattatt 3180

aacgcttaca atttcctgat gcggtatttt ctcgcatgca tcactagtga attcgcggcc 3240

gcctgcaggt cgacctgcag gcatgcaagc ttgccaacga ctacgcacta gccaacaaga 3300

gcttcagggt tgagatgtgt ataagagaca gctgtcttaa tgaatcggcc aacgcgcggg 3360

gagaggcggt ttgcgtattg ggcgctcttc cgcttcctcg ctcactgact cgctgcgctc 3420

ggtcgttcgg ctgcggcgag cggtatcagc tcactcaaag gcggtaatac ggttatccac 3480

agaatcaggg gataacgcag gaaagaacat gtgagcaaaa ggccagcaaa aggccaggaa 3540

ccgtaaaaag gccgcgttgc tggcgttttt ccataggctc cgcccccctg acgagcatca 3600

caaaaatcga cgctcaagtc agaggtggcg aaacccgaca ggactataaa gataccaggc 3660

gtttccccct ggaagctccc tcgtgcgctc tcctgttccg accctgccgc ttaccggata 3720

cctgtccgcc tttctccctt cgggaagcgt ggcgctttct catagctcac gctgtaggta 3780

tctcagttcg gtgtaggtcg ttcgctccaa gctgggctgt gtgcacgaac cccccgttca 3840

gcccgaccgc tgcgccttat ccggtaacta tcgtcttgag tccaacccgg taagacacga 3900

cttatcgcca ctggcagcag ccactggtaa caggattagc agagcgaggt atgtaggcgg 3960

tgctacagag ttcttgaagt ggtggcctaa ctacggctac actagaagga cagtatttgg 4020

tatctgcgct ctgctgaagc cagttacctt cggaaaaaga gttggtagct cttgatccgg 4080

caaacaaacc accgctggta gcggtggttt ttttgtttgc aagcagcaga ttacgcgcag 4140 

aaaaaaagga tctcaagaag atcctttgat cttttctacg gggtctgacg ctcagtggaa 4200 

cgaaaactca cgttaaggga ttttggtcat gagattatca aaaaggatct tcacctagat 4260

ccttttaaat taaaaatgaa gttttaaatc aatctaaagt atatatgagt aaacttggtc 4320 

tgacagttac caatgcttaa tcagtgaggc acctatctca gcgatctgtc tatttcgttc 4380

atccatagtt gcctgactcc ccgtcgtgta gataactacg atacgggagg gcttaccatc 4440 

tggccccagt gctgcaatga taccgcgaga cccacgctca ccggctccag atttatcagc 4500 

aataaaccag ccagccggaa gggccgagcg cagaagtggt cctgcaactt tatccgcctc 4560 

catccagtct attaattgtt gccgggaagc tagagtaagt agttcgccag ttaatagttt 4620

gcgcaacgtt gttgccattg ctacaggcat cgtggtgtca cgctcgtcgt ttggtatggc 4680

ttcattcagc tccggttccc aacgatcaag gcgagttaca tgatccccca tgttgtgcaa 4740 

aaaagcggtt agctccttcg gtcctccgat cgttgtcaga agtaagttgg ccgcagtgtt 4800 

atcactcatg gttatggcag cactgcataa ttctcttact gtcatgccat ccgtaagatg 4860

cttttctgtg actggtgagt actcaaccaa gtcattctga gaatagtgta tgcggcgacc 4920 

gagttgctct tgcccggcgt caatacggga taataccgcg ccacatagca gaactttaaa 4980

agtgctcatc attggaaaac gttcttcggg gcgaaaactc tcaaggatct taccgctgtt 5040 

gagatccagt tcgatgtaac ccactcgtgc acccaactga tcttcagcat cttttacttt 5100 

caccagcgtt tctgggtgag caaaaacagg aaggcaaaat gccgcaaaaa agggaataag 5160 

ggcgacacgg aaatgttgaa tactcatact cttccttttt caatattatt gaagcattta 5220 

tcagggttat tgtctcatga gcggatacat atttgaatgt atttagaaaa ataaacaaat 5280 

aggggttccg cgcacatttc cccgaaaagt gccacctgac gtctaagaaa ccattattat 5340 

catgacatta acctataaaa ataggcgtat cacgaggccc tttcgtc 5387 

SEQ ID No. 82 

ATGACAGTTCAATCACAGCAACAATCCCAGGCTATTCCCGTCCTTTCTTCCCAGAATGGCACTGAACCCCAAGACGC 
AAACAAGGAGGTTGTTCAGAATGTCGCTGCCAAAGGAGTGCAATACTTCAACCCTGAGCAACTTCCTGCACCAGGTC 
TCGGTATAAACGGTCCCAATAATACTCTACCAAAGGTCTTTACACCCATCAAGATTCGCGGCATGACCATGCCCAAC 
CGTATCTGGGTCAGCCCCATGTGCCAATACAGTGCCCGTGACGGCTTTCAGCAGCCTTGGCACTTTGCCCACTACGG 
CGGACTGGCCCAACGTGGCCCTGGCCTCATCATGCTAGAAGCTACCGCAGTTCAAGCACGTGGCCGTATCACACCTG 
AAGATTCTGGCATCTGGCTAGACTCTCATGTTGAGGGACTGCGAAAGCACGTCGAGTTTGCCCATGCCAACAACTCT 
CTTATCGGTATCCAGATTGGCCATGCTGGTCGCAAGGCCTCCTGCGTTGCTCCTTGGTTAGACGCCGGACTTGCCGC 
TGAAAAGGCCGCTGGTGGATGGCCCGATGACGTTGTCGGACCTAGCAACGAGCCTTTTGCTCCTGGCTACCCTACCC 
CCCGTGCTATTACTCTTGAAGAGATTGAACAGTTGAAGGAGGACTTTGTTTCCGGTGTTCGTCGAGCGGTTGAAGCA 
GGATTTGACACTATCGACTTCCATTTCGCTCACGGTTATCTTGTTTCCAGCTTCCTGTCCCCTGCCACCAACAAGCG 
TACCGACAAGTACGGAGGTAGCTTCGAGAACAGAGTGCGCCTTGCTCTCGAGATTGTCGAGGCTGCACGAGCTGTTA 
TGCCTGAGGACATGCCCTTGTTCACTCGCATCAGTGGAACTGACTGGCTGGAGAACAACCCTGAGTACGAGGGAGAG 
ACCTGGACTCTTGAGCAGAGCATCAAGCTTGCACACCAGTTAGCAGACCGTGGTGTCGATGTTTTGGATGTTTCCAG 
TGGTGGCATCCACAAGATGCAAAAGGTCGCTGCTGGTCCCGGTTACCAGGCACCTCTTGCCAAGGCGATCAAGAAGT 
CAGTTGGAGACAAGATGTTGATCAGCACTGTTGGTAGCATCAAGATAGGTACCCTTGCGGAGGAGATCATCGCTGGA 
GGAGAGGACGATACCCCCTTGGATCTTGTGGCTTCAGGCCGTCTGTTCCAGAAGAACACTGGACTTGTTTGGTCATG 
GGCTGACGATCTGAACACTTCTATCCAGATCGCTCATCAGATCGCATGGGGTTTCGGTGGCAGAGCTAAGAAGAACG 
CTCCCAAGCTTGTCTTA 

SEQ ID No. 83 

FG00074.1 hypothetical protein 38131 39459 +

MTVQSQQQSQAIPVLSSQNGTEPQDANKEVVQNVAAKGVQYFNPEQLPAPGLGINGPNNTLPKVFTPIKIRGMTMPN
RIWVSPMCQYSARDGFQQPWHFAHYGGLAQRGPGLIMLEATAVQARGRITPEDSGIWLDSHVEGLRKHVEFAHANNS
LIGIQIGHAGRKASCVAPWLDAGLAAEKAAGGWPDDVVGPSNEPFAPGYPTPRAITLEEIEQLKEDFVSGVRRAVEA
GFDTIDFHFAHGYLVSSFLSPATNKRTDKYGGSFENRVRLALEIVEAARAVMPEDMPLFTRISGTDWLENNPEYEGE
TWTLEQSIKLAHQLADRGVDVLDVSSGGIHKMQKVAAGPGYQAPLAKAIKKSVGDKMLISTVGSIKIGTLAEEIIAG
GEDDTPLDLVASGRLFQKNTGLVWSWADDLNTSIQIAHQIAWGFGGRAKKNAPKLVL

SEQ ID No. 84 

ATGGACACGTCTCGATTCGTGTCTGGTCTCACACCGCCTCTCGTCGACTCGATCGATGCACTCAAGATCAGCAACTT 
TGTCCCCACTCGAAGTGGCCACCCTCCTCCTGGCTCGGTCCCGGAATCCATCCTGCCAGAGGGTGTCAAAAAACCGG 
CTTTGTTCCAAACGTTGACATTGCCCTTTGCTGCACCGGAACAGGCGGGTAAGATGACCTTCAAGAACCGCATCATT 
GTCTCTCCCATGTGCCAGTACTCTGCGAACAATGGTCTTCCTACTCCGTACCACATTGCGCATTTGGGATCGTTTGC 
CCTGCACGGTGTGGGAAACGTCATGGTCGAAGCATCTGGTGTTGAGCCAGAGGGGAGGATCACGCCTCAGGACCTGG 
GTATTTGGTCGGAACAGCATCGGGATGCACACAAGGCGCTGGTGTCGGTGCTCAAGTCCTTCACGGATGGTCTGGGO? 
GTAGGGCTGCAACTGGCGCATGCGGGAAGGAAGGCCTCGGACTGGTCACCTTTCTACCGCGGAGAAAAGAAGCAAAA 
GTTTGTGACGCAGGAGGAAGGTGGCTGGCCGGATCGTGTCGTCGCTCCTTCGGCCATCGCATATGCGCAAGGTCACG 
TTACCCCTCGAGCTCTCAGGACCGAGGACATCAACAAGTTGCAAGACAAATTCGTTCAGTCGGCACGATGGGCGTTT 
GAAGCTGGGTATGACTACGTCGAACTTCACAGCGCTCACGGATACCTGATGCACTCGTTCCTCAGCCCGTTGACCAA 
TCAGCGTACCGACGAGTACGGCGGTAGCCTGGAGAACCGCGCTCGATTTCTGCTCAACGTTGCCCGTCGAATCCGCC
AAGAATTCCCCAACAAGGGTCTCTGGGTGCGCGTCAGCTCCACCGACTGGGCCGACCAAGCGCACCAAGCCGACTCT 
TGGACCGTTGACCAGACGGTTGAACTCGCCAAGATGCTCCAAGAGGCTCGAGTCGACCTGCTAGACGTCAGCTCCGG 
CGGCCTGGTTCCATTCCAAAAAATCACCGTGGGAGCCGGATACCAGCTATTCGGAGCAAAAGCCGTTCGCGATGCTC 
TGGCCAAAATCGAACCCGACGCGTCCAAACGCATGCTCGTCGGGGCCGTGGGAATGATGGAAGGTTCCTACGATTCG
CCCAACGGCCAAGACCGCAGCCAGATTGGCAAGTTGGCCGAGCAGTCGATTCAGAGCGGAGAGTGTGATGCGGTACT 
GTTGGCACGTGGATTGATGTCCTACCCAAGCTGGACCGAGGATGCTAGTGTAGCGCTGATGGGTACCAGGGCAGCTG 
GCAACCCGCAGTACCATCGGGTTCACGTGGCTAAGAAGTGA 

SEQ ID No. 85

MDTSRFVSGLTPPLVDSIDALKISNFVPTRSGHPPPGSVPESILPEGVKKPALFQTLTLPFAAPEQAGKMTFKNRII

VSPMCQYSANNGLPTPYHIAHLGSFALHGVGNVMVEASGVEPEGRITPQDLGIWSEQHRDAHKALVSVLKSFTDGLG 

VGLQLAHAGRKASDWSPFYRGEKKQKFVTQEEGGWPDRVVAPSAIAYAQGHVTPRALTTEDINKLQDKFVQSARWAF

EAGYDYVELHSAHGYLMHSFLSPLTNQRTDEYGGSLENRARFLLNVARRIRQEFPNKGLWVRVSSTDWADQAHQADS 

WTVDQTVELAKMLQEARVDLLDVSSGGLVPFQKITVGAGYQLFGAKAVRDALAKIEPDASKRMLVGAVGMMEGSYDS

PNGQDRSQIGKLAEQSIQSGECDAVLLARGLMSYPSWTEDASVALMGTRAAGNPQYHRVHVAKK 

SEQ ID No. 86 

MSALFEPYTLKDVTLRNRIAIPPMCQYMAEDGLINDWHQVHYASMARGGAGLLVVEATAVAPEGRITPGCAGIWSDA 
HAQAFVPWQAIKAAGSVPGIQIAHAGRKASANRPWEGDDHIGADDARGWETIAPSAIAFGAHLPNVPRAMTLDDIA
RVKQDFVDAARRARDAGFEWIELHFAHGYLGQSFFSEHSNKRTDAYGGSFDNRSRFLLETLAAVREVWPENLPLTAR 
FGVLEYDGRDEQTLEESIELARRFKAGGLDLLSVSVGFTIPETNIPWGPAFMGPIAERVRREAKLPVTSAWGFGTPQ 
LAEAALQANQLDLVSVGRAHLADPHWAYFAAKELGVEKASWTLPAPYAHWLERYR 

SEQ ID No. 87

MSALFEPFRLRDTTIPNRIWMPPMCQYSAAPEGPSAGVPGDWHFAHYGARAVGGTGLIWEATGVSPEGRISPQDLG 
LWNDTQVEAFRRITGFLRSQGTVPAVQLAHAGRKASTAQPWRGGAPVGADAYGWQPLAPSALAFDERHPVPTELTVP 
QIQEAVGRFADAARRALAAGFEIAEIHGAHGYLIHEFLSPHSNQRTDAYGGSYANRTRFALEVVDAVREVWPDDKPL 
FFRVSATDWLEEGGWTPDDTVRFARDLEAHGIDLLDVSTGGNVPRVRIPTGPGYQVPFAARVKAGSTLPVAAVGLIT 

EPGQAEKILANGEADAVLLGRELLRNPSWAQHAARELGVDARMPDQYGWGM

SEQ ID No. 88 

MTVSSAAAPQPASPAAPLLFTPLKLRSLELPNRVVVSPMCTYSATDGVANEFHLVHLGQYALGGAGLILAEATAVSP 
EGRITPEDLGLWDDRQIVPLGHITDFVHQHGGHIGVQLAHAGRKASTYAPWRGKGAVPAELGGWQVIGPDENSFHDL 
FPTPAMyiGADELRGWDAFSAAARRAQVAGFDAVEVHAAHGYLLHQFLSPLANTRTDDYGGSFENRTRLLLEVVRA
RHVWPAHLPLFVRLSATDWAEGGWDLEQTVQLSKLLKYEGVDVLDISSGGLTAAQQIEVGPGYQVPFAAAVSRAETE 
ISVMAVGLIETGAQAEAILQAGDADLIALGRPFLRDPHWAQRAARELGLRPVSIDQYARAGW 

SEQ ID No. 89

MRIVCIGGGPAGLYFAILMKKLNPAHEIRVIERNRPYDTFGWGWFSDATMDNMREWDSETADAIQVAFNHWDDIEL
HFKGRTIRSGGHGFVGIGRKMMLNILQARCEELGVELVFDREVESDAEFPDADLVIASDGINSRIRNKYAEVFKPDI 
VTRPNRYIWLGTTKLFDAFTFFFEKTEHGWFQAHIYKFDDKTTTFIVECPEHVWKAHGLDTADQEQSIAFCEQLFGK 
HLDGHRLMTNSRHLRGSAWLNFQRVKCEQWHHYNGKSHWLMGDAVHTAHFAIGSGTKLALEDAIELTRLFRDEGDT 
REHIPAVLERYQAARNIDVLRLQNAAWNAMEWFEVCGARYCDTLEPEQFMYSMLTRSQRISHENLRLRDAGWLEGYE 

RWI^ARKAGMTVRDDETPPPPMFTPFKLRGLTIJysrRIVMSPMAMYSAEDGAP^

SPDARITPGCAGMYKPEHVNAWKRIVDFVHGNSDAKIGMQLGHAGRKGATKLAWEGIDEPLEAGAWELISASPLPYL 
PHSQVPRAMTRDDMERVRNDFVRATRMAAEAGFDILELHCAHGYLLSSFLSPLTNRRTDEFGGDLENRARFPLEVFK 
AMRAMWPTNRPMSVRLSCHDWFPGGNTADDAVAIARLFKEAGADIIDCSSGQWKGDQPVYGRMYQTPFADRIRNEV 
GIPTLAVGAISEADHANSIIAAGRADLCAIARPHLADPAWTLHEAAKIGFGEVAWPKQYRSARGQYETNLQRAAAAV 

AGK

SEQ ID No. 90 

MREEPSSAQLFKPLKVGRCHLQHRMIMAPTTRFRADGQGVPLPFVQEYYGQRASVPGTLLITEATDITPKAMGYKHV 
PGIWSEPQREAWREIVSRVHSKKCFIFCQLWATGRAADPDVLADMKDLISSSAVPVEEKGPLPRALTEDEIQQCIAD 
FAQAARNAINAGFDGVEIHGANGYLIDQFTQKSCNHRQDRWGGSIENRARFAVEVTRAVIEAVGADRVGVKLSPYSQ
YLGMGTMDELVPQFEYLIAQMRRLDVAYLHLANSRWLDEEKPHPDPNHEVFVRVWGQSSPILLAGGYDAASAEKVTE 
QMAAATYTNVAIAFGRYFISTPDLPFRVMAGIQLQKYDRASFYSTLSREGYLDYPFSAEYMALHNFPV 

SEQ ID No. 91 

MTIRKLDGEESlffiFQPLEIANGRIRLSHRWHAPMTRNRGVPLNPTSTPEQPNRIWYPGDLMVQYYRQRATPGGLII
SEGVPPSLESNGMPGVPGLWTPEQAAGWKRWDAVHEQGGYIYCQLWHAGRATIPQMTGSPAVSASATVWDSPTECY 
SHPPVGSTEPVRYADHPPIELTIPHLKQTIRDYCNAAKTAMEIGFDGVELHAGNGYLPEQFLSSNVNKRTDEYGGSP 
EKRCRFVLELMDELAATVGEDNLAIRLSPFGLFNQARGEQRVETWTFLCESLKKAHPNLSYVSFIEPRYEQIFSYEE 
KDNFLRSWGIiSDVDLSSFRKIFGTTPFFSAGGWDQSNSWGVLEEGRYDALIiYGRYFTSNPDLVERLRKGIPFTPYDR 

SRFYGPFEDNAKCYVDYPPATASS

SEQ ID No. 92

MTVESTNSFWE^AGTKQIEIAPLGSTKLFQPJKVGKNILPHRVAHAPTTRFRAAK^^HTPSDLQL^ 
IITEATFTSEQGGMDLHVPGIYNDAQTKAWKKINDEIHANGSFSSMQLWYLGRVANPKDLKDAGLPLIGPSAVYWDE 
ESEKLAKSVGNELRELTEKEIDHIVEVEYPNAAKRAIEAGFDYIEVHSAPGYFLDQFLNPASNKRTDKYGGSIENRA 
RLLLRIIDKLIGIVGAEKLAVRIAPWSSFLGMEIEGEEIHSYILQQLQQRADNGQQIAYVSLIEPRVJGI

QKGRSNEFAYKYWKGNFVRAGNYTYDAPEFKTLLHDLDNDRTIVGFARFFTSNPDLVEKLKLGKPLNHYDREEFYKY 
YNYGYNSYDESEKQVIGKPLV 

SEQ ID No. 93

MTIESTNSFV^PSDTKLIDVTPLGSTKLFQPIKVGlSnsiVLPQRIAYVPTTRFRASKDHIPSDLQLNYYNA^ 
IITEATFASERGGIDLHVPGIYNDAQAKSWKKINEAIHGNGSFSSVQLWYLGRVANAKDLKDSGLPLIAPSAVYWDE
NSEKLAKEAGNELRALTEEEIDHIVEVEYPNAAKHALEAGFDYVEIHGAHGYLLDQFLNLASNKRTDKYGCGSIENR 
ARLLLRWDKLIEWGANRLALRLSPWASFQGMEIEGEEIHSYILQQLQQRADNGQQLAYISLVEPRVTGIYDVSLK 
DQQGRSNEFAYKIWKGNFIRAGNYTYDAPEFKTLINDLKNDRSIIGFSRFFTSNPDLVEKLKLGKPLNYYNREEFYK 
YYNYGYNSYDESEKQVIGKPIi

SEQ ID No. 94 

MAATAAESRLFQPLKLTPKITLGHRLAMAPLTRFRSDDEHVPIVPLMTTYYSQRASVPGTLLVTEATFXSPAAGGYD 
NVPGIYNAAQIAAWKKITDAVHAKGSFIFCQLWSLGRAANPEVLAKEGGLKLKSSSAVPMEEGAPVPEEMTVAEIKE
RVAEYAAAAECNAVEAGFDGVEIHGANGYLIDQFLQDTCNQRTDEYGGSIENRSRFAHEVVKAWEAVGAEKTGIRLS 
PYSTFQGMKMKKDLIPQFEDVIRKINGFGIAYLHLTQSRVAGNMDVQPEEDEENLAFAAKLWDGPLLIAGGLTPETA 
KHLVDREFPEKDWATFGRHFISTPDLPFRIKEGIELNPYDRDTFYKAKSPDGYIDQPFSKEFEKVYGAQA 

SEQ ID No. 95 

MSFVKDFKPQALGDTNLFKPIKIGNNELLHRAVIPPLTRMRALHPGNIPNRDWAVEYYTQRAQRPGTMIITEGAFIS 
PQAGGYDNAPGWSEEQMVEWTKIFNAIHEKKSFVWQLWVLGWAAFPDNLARDGLRYDSASDNVFMDAEQEAK^^ 
ANNPQHSLTKDEIKQYIKEYVQAAKNSIAAGADGVEIHSANGYLLNQFLDPHSNTRTDEYGGSIENRARFTLEWDA 
LVEAIGHEKVGLRLSPYGVFNSMSGGAETGIVAQYAYVAGELEKRAKAGKRLAFVHLVEPRVTNPFLTEGEGEYEGG 
SNDFVYSIWKGPVIRAGNFALHPEVVREEVKDKRTLIGYGRFFISNPDLVDRLEKGLPLNKYDRDTFYQMSAHGYID 
YPTYEEALKLGWDKK 

SEQ ID No. 96 

MPFVKDFKPQALGDTNLFKPIKIGNNELLHRAVIPPLTRMRAQHPGNIPNRDWAVEYYAQRAQRPGTLIITEGTFPS 
PQSGGYDNAPGIWSEEQIKEWTKIFKAIHENKSFAWQLWLGWAAFPDTLARDGLRYDSASDNVYMNAEQEEKA^ 
ANNPQHSITKDEIKQYVKEYVQAAKNSIAAGADGVEIHSANGYLLNQFLDPHSNNRTDEYGGSIENRARFTLEVVDA 
WDAIGPEKVGLRLSPYGVFNSMSGGAETGIVAQYAYVLGELERRAKAGKRLAFVHLVEPRVTNPFLTEGEGEYNGG 
SNKFAYSIWKGPIIRAGNFALHPEWREEVKDPRTLIGYGRFFISNPDLVDRLEKGLPLNKYDRDTFYKMSAEGYID 
YPTYEEALKLGWDKN 

SEQ ID No. 97

MPFVKGFEPISLRDTNLFEPIKIGNTQLAHRAVMPPLTRMRATHPGNIPNKEWAAVYYGQRAQRPGTMIITEGTFIS 
PQAGGYDNAPGIWSDEQVAEWKNIFLAIHDCQSFAWVQLWSLGWASFPDVLARDGLRYDCASDRVYMNATLQEKAKD 
ANNLEHSLTKDDIKQYIKDYIHAAKNSIAAGADGVEIHSANGYLLNQFLDPHSNKRTDEYGGTIENRARFTLEWDA 
LIETIGPERVGLRLSPYGTFNSMSGGAEPGIIAQYSYVLGELEKRAKAGJOILAFVHLVEPRVTDPSLVEGEGEYSEG 
TNDFAYSIWKGPIIRAGNYALHPEWREQVKDPRTLIGYGRFFISNPDLVYRLEEGLPLNKYDRSTFYTMSAEGYTD 
YPTYEEAVDLGWNKN 

SEQ ID No. 98 

GCTAGCATGACTGTCGCCGATATCGA 
SEQ ID No. 99 

GCTAGCCTATACATCGAAAATAGACTGC 

SEQ ID No. 100 
ACTAGTCCAGGGGACTGTCGTGGTCAA 

SEQ ID No. 101 
CAATTGCCCAGGCCTAATGCATGCTG 
CLAIMS

1. Method of identifying an anti-fungal agent which targets an essential protein or gene
of a fungus comprising contacting a candidate substance with

(i) a NADH:flavin oxidoreductase protein which comprises the sequence shown
by SEQ ID NO: 3,

(ii) a NADH:flavin oxidoreductase protein which is a homologue of (i) and
which comprises the sequence shown by SEQ ID NO: 8, 12, 14, 19, 24, 42, 44, 83 or
85,

(iii) a protein which has 50% identity with (i) or (ii),

(iv) a protein comprising a fragment of (i), (ii) or (iii) which fragment has a
length of at least 50 amino acids,

(v) a polynucleotide that comprises sequence which encodes (i), (ii), (iii) or (iv),

(vi) a polynucleotide comprising sequence which has at least 70% identity with
the coding sequence of (v),

and determining whether the candidate substance binds or modulates (i), (ii), (iii), (iv),
(v) or (vi), wherein binding or modulation of (i), (ii), (iii), (iv), (v) or (vi) indicates that
the candidate substance is an anti-fungal agent.
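
The identity thresholds in (iii) and (vi) can be illustrated with a minimal Python sketch. It assumes the two sequences have already been aligned to equal length with "-" as the gap character, and it counts identity over columns in which neither sequence has a gap; that denominator convention and the function name `percent_identity` are assumptions made for illustration, not a definition taken from the claims.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity over aligned columns where neither sequence has a gap.

    Assumption: the inputs are rows of an existing pairwise alignment, so
    they must have the same length. Comparison is case-insensitive.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == gap or b == gap:
            continue  # skip gapped columns
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        return 0.0
    return 100.0 * matches / compared


# Ten-residue stretches as printed in SEQ ID No. 86 and SEQ ID No. 87.
identity = percent_identity("NRIAIPPMCQ", "NRIWMPPMCQ")
print(f"{identity:.1f}% identity")
```

A candidate protein would meet the threshold in (iii) under this convention when the returned value is at least 50.0; other conventions (for example, dividing by the full alignment length) give different values for gapped alignments.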

2. Method according to claim 1 wherein (iii) or (iv) have an oxidoreductase activity.

3. Method according to claim 1 or 2 wherein (i), (ii), (iii) or (iv) comprise one or more
of the motifs defined by regions 1 to 11 in Figures 1 and 2.

4. Method according to any one of the preceding claims comprising carrying out a
redox reaction in the presence and absence of the candidate substance to determine
whether the candidate substance inhibits the oxidoreductase activity of a protein as
defined in any one of the preceding claims, wherein the redox reaction is carried out by
contacting said protein with NADH or NADPH, and an electron acceptor, under
conditions in which in the absence of the candidate substance the protein catalyses
reduction of the electron acceptor.
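
The comparison described in claim 4 can be sketched as a percent-inhibition calculation. The sketch below assumes the reaction is followed as a signal that falls over time as NAD(P)H is consumed (for example, an absorbance trace), fits a least-squares slope to each trace, and compares the rates with and without the candidate substance. The readout, the example numbers and the function names are hypothetical and are not taken from the claims.

```python
def slope(times, values):
    """Least-squares slope of values against times."""
    n = len(times)
    if n != len(values) or n < 2:
        raise ValueError("need two equal-length series with at least 2 points")
    mean_t = sum(times) / n
    mean_v = sum(values) / n
    sxx = sum((t - mean_t) ** 2 for t in times)
    if sxx == 0:
        raise ValueError("time points must not all be identical")
    sxy = sum((t - mean_t) * (v - mean_v) for t, v in zip(times, values))
    return sxy / sxx


def percent_inhibition(rate_without, rate_with):
    """Percent reduction in reaction rate caused by the candidate substance."""
    if rate_without == 0:
        raise ValueError("no activity in the absence of the candidate")
    return 100.0 * (1.0 - rate_with / rate_without)


# Hypothetical traces: signal falls as NAD(P)H is oxidised.
times = [0, 30, 60, 90]                      # seconds
without_candidate = [1.00, 0.90, 0.80, 0.70]
with_candidate = [1.00, 0.95, 0.90, 0.85]

rate_without = -slope(times, without_candidate)  # consumption rate
rate_with = -slope(times, with_candidate)
print(f"{percent_inhibition(rate_without, rate_with):.0f}% inhibition")
```

A positive value indicates that the candidate slows the reaction; what level of inhibition counts as a hit is a screening decision that the claim does not fix.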
5. Method according to any one of the preceding claims wherein (iii) is a protein
comprising the sequence of any of the following: SEQ ID NO: 6, 10, 16, 22, 27, 30,
33, 35, 38, 40.

6. Method according to any one of the preceding claims wherein (i) or (ii) is an
oxidoreductase of Aspergillus flavus; Aspergillus fumigatus; Aspergillus nidulans;
Aspergillus niger; Aspergillus parasiticus; Aspergillus terreus; Blumeria graminis;
Candida albicans; Candida cruzei; Candida glabrata; Candida parapsilosis; Candida
tropicalis; Colletotrichum trifolii; Cryptococcus neoformans; Encephalitozoon
cuniculi; Fusarium graminearum; Fusarium solani; Fusarium sporotrichioides;
Leptosphaeria nodorum; Magnaporthe grisea; Mycosphaerella graminicola;
Neurospora crassa; Phytophthora capsici; Phytophthora infestans; Plasmopara
viticola; Pneumocystis jiroveci; Puccinia coronata; Puccinia graminis; Pyricularia
oryzae; Pythium ultimum; Rhizoctonia solani; Schizosaccharomyces pombe;
Trichophyton interdigitale; Trichophyton rubrum; or Ustilago maydis.

7. Method according to any one of the preceding claims which further comprises
formulating the identified anti-fungal agent into an agricultural or pharmaceutical
composition.

8. Method according to any one of claims 1 to 6 which further comprises killing or
impairing the growth of a fungus by contacting the fungus with the identified
anti-fungal agent.

9. Use of (i), (ii), (iii), (iv), (v) or (vi) as defined in any one of claims 1 to 6 to identify
or obtain an anti-fungal agent.

10. Use of an anti-fungal agent identified by the method of any one of claims 1 to 6 in
the manufacture of a medicament for prevention or treatment of fungal infection.
11. Method of detecting the presence of a fungus in a sample comprising detecting the
presence in the said sample of a protein or polynucleotide as defined in any one of
claims 1 to 3, 5 or 6.

12. Method according to claim 11 wherein the sample is from a human, animal or
plant individual who is suspected of having a fungal infection.

13. An isolated protein or polynucleotide as defined in any one of claims 1 to 3, 5 or
6.

14. A vector comprising a polynucleotide as defined in any one of claims 1 to 3, 5 or
6.

15. A recombinant cell comprising a polynucleotide as defined in any one of claims 1
to 3, 5 or 6 or a vector according to claim 14.

16. A method of obtaining a protein as defined in any one of claims 1 to 3, 5 or 6 
comprising expressing the protein from a polynucleotide as defined in any one of 
claims 1 to 3, 5 or 6 or a vector according to claim 14. 

17. A method of obtaining a polynucleotide as defined in claim 1 to 3, 5 or 6
comprising replication of a vector as defined in claim 14 or synthesis of the
polynucleotide by condensation of nucleotides.

18. An organism which is transgenic for a polynucleotide as defined in any one of 
claims 1 to 3, 5 or 6. 

19. An organism which has been genetically engineered to render a polynucleotide or 
protein as defined in any one of claims 1 to 3, 5 or 6 non-functional or inhibited. 

20. An antibody which is specific for a protein as defined in any one of claims 1 to 3, 
5 or 6. 
21. A method for preventing or treating a fungal infection comprising administering
an anti-fungal agent identified by the method of any one of claims 1 to 6.

22. A method for preventing or treating a fungal infection comprising administering a
protein or polynucleotide as defined in any one of claims 1 to 3, 5 or 6.

23. A method of killing, or impairing the growth of, a fungus comprising inhibiting
the expression or activity of a polynucleotide or protein as defined in any one of claims
1 to 3, 5 or 6.

24. A method according to claim 23 wherein the fungus has infected a human, animal 
or plant individual. 

25. A fungus which has been killed, or whose growth has been impaired, by inhibition 
of the expression or activity of a protein or polynucleotide as defined in any one of 
claims 1 to 3, 5 or 6. 
[Figure 1, alignment panels. Rows: SEQ 3, 6, 8, 10, 12, 14, 16, 19, 22, 24, 27, 30,
33, 35, 38, 40, 42, 44, 83 and 85; Bacteria: T44612, NP_625402, NP_295913,
AF320254; OYE family: Af4875, Af4961, Ca2460, Nc4452, ScOYE1, ScOYE2, ScOYE3,
A36990. Conserved regions are marked with * or #.]

Figure 1. A multiple alignment of the 2031 OR amino acid sequence
from A. fumigatus (SEQ ID No. 3) along with related 2031 ORs from
other fungi and bacteria (see Example 4) and OYEs. Regions 1-11,
marked with * or #, refer to amino acids conserved between ORs
but not OYEs.
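
As a rough illustration of what "conserved between ORs but not OYEs" can mean at the level of alignment columns, the following Python sketch reports the columns in which every OR row carries the same residue and no OYE row carries that residue. This strict reading, the function name `or_specific_columns` and the short toy rows are assumptions for illustration only; the rows are not the Figure 1 alignment, and the regions in the figure may have been defined differently.

```python
def or_specific_columns(or_rows, oye_rows, gap="-"):
    """Return 0-based column indices conserved in all OR rows but absent in OYE rows.

    Assumption: all rows come from one multiple alignment and share a width.
    """
    if not or_rows:
        raise ValueError("at least one OR row is required")
    width = len(or_rows[0])
    if any(len(row) != width for row in list(or_rows) + list(oye_rows)):
        raise ValueError("all aligned rows must have the same length")
    hits = []
    for i in range(width):
        residues = {row[i] for row in or_rows}
        if len(residues) != 1:
            continue  # not conserved among the ORs
        residue = next(iter(residues))
        if residue == gap:
            continue  # a column of gaps is not a conserved residue
        if all(row[i] != residue for row in oye_rows):
            hits.append(i)
    return hits


# Toy rows for illustration only.
or_rows = ["SPMCQ", "SPMCT", "APLCQ"]
oye_rows = ["APTTR", "PPLTR"]
print(or_specific_columns(or_rows, oye_rows))
```

In this toy input only the fourth column (index 3) qualifies: the second column is conserved in the OR rows but also present in the OYE rows.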

Fungal 2031 ORs are given by the following SEQ ID No.: A. 
fumigatus, SEQ ID Nos . 3, S and 8; A. nidulaxis , SEQ ID No. 10; C. 
albicans SEQ ID Nos. 12 and 14; N. crassa, SEQ ID Nos. 16 and 19; 
M. grisea SEQ ID Nos. 22 and 44; S. pomhe SEQ ID No. 24 
(NP_595868) ; C. trifolii SEQ ID No. 27; F. sporotrichioides SEQ 
ID Nos. 30, 33 and 35; F. graminearum SEQ ID Nos. 3 8 and 83; M. 
graminicola SEQ ID Nos. 40 and 42; U. maydis SEQ ID No 85. 

Bacterial ORs resembling 2031 are: 

T44612 {Pseudomonas putida) , SEQ ID No. 86; NP_625402 
{Streptomyces coelicolor) , SEQ ID No. 87; NP_295913 {Deinococcus 
radiodurans) , SEQ ID No. 88; AF320254 (Azoarcus evansii, SEQ ID 
No. 89. 

Fungal ORs similar to the Old Yellow Enzyme family (originally 
identified in S. cerevisiae) : 

A fumigatus, Af4875 and Af4961, SEQ ID Nos. 90 and 91 
respectively; C. albicans, Ca2460 and A36990, SEQ ID Nos. 92 and 
93 respectively; N. crassa, Nc4452, SEQ ID No. 94; S. cerevisxae, 
OYEl, OYE2 and OYES, SEQ ID Nos. 95-97 respectively. 

Details of the sequence searches that identified the ORs other
than SEQ ID No. 3, and methods for the construction of multiple
alignments, are given in Example 4 hereinafter.
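
The idea behind the marked regions of Figure 1, namely alignment positions that are conserved among the ORs but not shared with the OYEs, can be illustrated with a minimal sketch. The function name, the gap character and the short toy sequences below are assumptions made for illustration only; they are not taken from the alignment of Figure 1 and do not reproduce the method of Example 4.

```python
def group_specific_columns(group_a, group_b, gap="-"):
    """Return 0-based alignment columns at which every sequence in
    group_a carries the same non-gap residue and no sequence in
    group_b carries that residue."""
    length = len(group_a[0])
    if any(len(seq) != length for seq in group_a + group_b):
        raise ValueError("all aligned sequences must have the same length")

    columns = []
    for i in range(length):
        residues_a = {seq[i] for seq in group_a}
        if len(residues_a) != 1:
            continue  # column is not conserved within group_a
        residue = next(iter(residues_a))
        if residue == gap:
            continue  # a column of gaps is not a conserved residue
        if all(seq[i] != residue for seq in group_b):
            columns.append(i)
    return columns


# Hypothetical aligned fragments (toy data, equal length).
ors = ["GTPYIDAK", "GTPYLDAK", "GTPYIDSK"]
oyes = ["YIDYPTAE", "YTDYPTAE"]
print(group_specific_columns(ors, oyes))
```

A real alignment would normally be read from an alignment file and would tolerate conservative substitutions; the strict identity test used here is only the simplest possible criterion.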



SEQ 1 
SEQ 2 
SEQ 4 
SEQ 5 
SEQ 7 
SEQ 9 
SEQ 11 
SEQ 13 
SEQ 15 
SEQ 17 
SEQ 18 
SEQ 20 
SEQ 21 
SEQ 23 
SEQ 25 
SEQ 26 
SEQ 28
SEQ 29 
SEQ 32 
SEQ 34 
SEQ 36 
SEQ 37 
SEQ 39 
SEQ 41 
SEQ 43 
SEQ 82
SEQ 84 



SEQ 1 
SEQ 2 
SEQ 4 
SEQ 5 
SEQ 7 
SEQ 9 
SEQ 11 
SEQ 13 
SEQ 15 
SEQ 17 
SEQ 18 
SEQ 20 
SEQ 21 



SEQ 36 
SEQ 37 
SEQ 39 
SEQ 41 
SEQ 43 
SEQ 82 
SEQ 84 
l^^l J^^^^l 7ic^^ Gc '^^^ EEEf !^EEE !f



101 111 121 131



^i^KC^;; c^^;^^ Ssicic^I^ iliicc^^ -cL^'oIZIt c^cct^ ?c?S? 



SEQ 23 IIIIII IIIIIII CGAAA CCXCGACCCA AACAAACAGC 

SEQ 29 



IIS 34 'ri'c^V^'^ 'ciillZV^ ^GGCci^G icil^TTTT ^T^TACCT *T»C^^TT «ITCACT,CI ATGCT^CTCA t^TCBAICCG 



X iciai^c^^G ^cG^;^ G^;;^; tsSj^E^ iiii^^ j^sj^ s^^Sc"^ sjs^? 

SEQ 2 TCTGTGCCTS GCGC&GRCGG GiaiiiaAAT AMWaTCACC GCACCGAGGft. GWTCTTACC TGTCGCARCC 

SEQ 9 ~ I II IIIIIII IIIIIIIIII III ATGaCAG TTCCATACCA 

SEQ 13 ;; IIIIIII IIIIIIIIII IIIIIIIIII IHIIIIIII IIIIIIIIII IIIIIIIIII I ^H. TGGCCSACTT 



!l i -ci-^:^ i^ioc^c^ iccic^c^s c^iiGccic iicccccG^ =EEE EEf^f 

il§ ^GciGi;^ -^-^^i EEE^ EEE!^ E^^*!?^ ^Ztl f!!-^-- !:^!!^!^5t 

ii I ii^GA^^ A^;:;;cccc"i ^EiE!! fEEE ^^EE !EE!E TEEzE E^^EE t'^.^^it 

SEQ 39 
SEQ 41 



tl IIIIIIIIII 'Ilimilll IIIa^^Ig ^T^^ ^^^C C CAGGCTA.TC CCGTCCTT.C ^CC^T GGCACTGAAC CCCAAGACGC 



r 
*-***-*"*-*■*■♦* — — — — 



L -CTCTTCCa ^^C. ^^CCTC-~. 

IIS i= I^c^^S cccccGcrGC cccgggigti ^cmciaca ccccggccov cgtccccgcc gccggcactc ccctcccctc ^ccccc--. 

SEQ i7 _ ATGGCTACTT CCACTACCTC CGACCTC 

20 Grcl^ ^ciTT|, G^C|C C|||^G CCCC^CCC. 3G.GCCGCCG GCXGGG.CCC CT.TGC.3C. GCJGG^G-- 

11° 1^ S?I^T S^Ss S§??^G ScAAa «^TACC«.C TGG^GCGGCG ATA3GTGTAC CGCA^^--- 

iii mm^^mm=^mm 

II iiiiiiipiiiiii^^PIp; 

SEQ 41 IZIII^^III I.II_Iatgt ccccaccacg cttcgaagcg gcccctgccg acccctcacc gctcggc 

i iiii s = ^?c^ =^ ^-^s 



SEQ 1 GG--— CC AGAAGATCCC CAAGCTCTTC ACGCCCTTGA CCATCCGTGG CGTCACC- ------- ----- ^J^^^c CGcS^GGT- 

SEQ 2 GG CC AGAAGATCCC CAAGCTCTTC ACGCCCTTGA CCATCCGTGG CGTCACC--- ------ i^TScAAT CGCATTGGCG 

SEQ 4 GG AT CGGCACCTCC CAAGCTCTTC CGGCCGCTTT CGGTGCGGGG TCTGACC— . - tTTCACAAT CGCATTGGC- 

sIq 5 GG -AT CGGCACCTCC CAAGCTCTTC CGGCCGCTTT CGGTGCGGGG TCTGACC--- ZZZ SS^G^^C CGCATCATG- 

SEO 7 GACCCC TACGCTCTTC CGGCCCTTAC AAATCCGCftA TGTGACG TT-rrACAAC AGACTTGGC- 

11^ I OG CA ATGCCGTCCC CAAGCTGTAC ACACCTCTGA CGGTGCGTGG GGTGACC-- ^^^^^^ I"::::::: I^^Sc ™gGG- 

I- - -::::::L^ ^^21?.^'. ^^.^ S -.c a.tacc.^t a™^^^ 

III \l G gcgatgtccc tactctcttc acccctctca agatccgtgg tgttgag 1:::::;^ ^C^C^S^C CGCCTCG^C- 

SED 1? AAACTCTCC CAACCCCTCA CCCTCCCCAA TGGCCTT" ^C CCTCCCCAAC ^GCCac 

is AAACTCTCC CAACCCCTCA CCCTCCCCAA TGGCCTT ^AC CCTCCCCAAC CGCCTCGTC_ 

20 CCATCCC AACGCTGTTC AAGCCTCTGA AGATCCGTGG CGTCGAG IIi::::: IcJScS^C CGCT^^GGC- 

Si5 21 CCATCCC AACGCTGTTC AAGCCTCTGA AGATCCGTGG CGTCGAG ' -S^TAAC AGAATGTTT- 

23 C AAAATTATTT ACTCCTCTTA AAATTAGAGG AGTGGAG -ItS^S^ ^CATCTGG- 

III II cTcc gacgctcttc aagcccctcc gcatccgcga cctcacc :::::::::: Ia^S^S^ S^gg- 

iS 26 CTCC gacgctcttc AAGCCCCTCC GCATCCGCGA CCTCACC -TT^CCAAAC CGCCTCTTC- 

slg 28 -AAGCTATTC ACACCCATCA CCATCCGCGG CGTCACA iJ^ccS^ CGCCTCTTC- 

29 ~- ^AAGCTATTC ACACCCATCA CCATCCGCGG CGTCACA- :^3^CCAAAC CGCCTCTTC^ 

III 34 gg":::::: ;;;^cc;Ai;^ icici;;;;i ZkTilT^ cci^ci": :cttcaaaac cgg^^^tt- 

III 36 -AGGTTTTC ACACBCATCA CCATCCGAGG CGTCACA Zlll?^^ CGTCTCTTT- 

SEQ 3? -AGGTTTTC ACACBCATCA CCATCCGAGG CGTCACA- y - IcTCCA^C CG^StATG- 

CCTCA AGATCCGA6G TCTTACC CTCCAGAAU 



SEQ 39 



II :==:^c csci^; ^^^^ g^k^^gtcg r — : z^^^^^ cgtSSgg: 

fcS^ ci^i- ^gc^c^x- 



SEQ 84 CCAGAGGGTG TCAAAAflACC 
501 511 



521 531 541 551 561 571 581 591



————""——— —————— —————————— ——————— — — — — CTCGC GCCCCTCTGC 

SEQ 1 TAAGTCCGTT TGCCCTTGCT CATATCGACG AAAGCTAATC CCCCGTCAG IIIZIIIIII CTCGC GCCCCTCTGC 

I ;^;g^; c^^^^ii; 'tiiiiiTc^ vd^T^i^ iccciiG^ ^^^^ ^.^it ^.^ii i^lllll 

SEQ 5 GTGTC GCCCATGTGC 

SEQ 7 ~ ZZZ IIIZIIIIII II CTCGC GCCCCTCTGC 

SEQ 9 " II I GTATC TCCAATGTGT 

SEQ 11 I IIIIIIZ GTTTC ACCAATGTGC 

SEQ 13 ' ZZZ ZZ GTTGC GCCCATGTGC 

SEQ 15 ZZZZZZ ZZ ^AAAGC CGCCATGGCC 

SEQ 17 ZZZ ZZZZZ-ZZZZ AAAGC CGCCATGGCC 

SEQ 18 ZZZ ZZZZZ GTCTC GCCCATGTGC 

SEQ 20 - — ; ZZZIII-II IIIIIIIIZZ GTCTC GCCCATGTGC 

SEQ 21 ' ZZZZ ZZZZZZZZ-Z GTTTC GCCCATGTGC 

SEQ 23 ZZZZ ZZZZIIIZII GTCAG GCCCATGTGC 

SEQ 25 ZZZZZZZZ ZZZZZZZZZ- - GTCAG CCCCATGTGC 

SEQ 26 ZZIIIIIIII IIIIIIZ ' CTTGC CCCTCTCTGC 

SEQ 28 IIIIIIZZ ZZZZZZZZZZ CTTGC CCCTCTCTGC 

SEQ 32 " ZZZZZ Z GTCTC CCCAATGTGT 

SEQ 34 ZZIIII ZZZZZZZZZZ CTTGC CCCTCTCTGT 

SEQ 36 "ZZZ ZZZZZ— CTTGC CCCTCTCTGT 

SEQ 37 ZZZZZZZZZZ ZZ ZZZZII - TTGAG G6GGCTCTGC 

SEQ 41 ZZZZ ZZZZZZZZZZ ZZZZZ AACGC GGCCATGTCG 

SEQ 43 ZIIIZZZZ ZZZZZZIZII - GTCAG CCCCATGTGC 

SEQ 82 IZZZZZZZZZ ZZZZZIZIIZ GTCTC TCCCATGTGC 
601 611 621 631 641 651 671 681 691

—2——— " _ _ ***-*■****■** •*.*+*-*•♦**** 

********* 'TlrZrr rrrlcATGAC CGAc"--"- TACCACATCG CCCATCTGGG TGGGftlCGCC CAACGCGGAC 

SEQ 1 CAMACTCCG CC CAGGACG ^CCACATGAC CGAC J^^cACATCG CCCATCTGGG TGGGATCGCC CAACGCGGAC 

SEQ 2 CAATACTCCG CC Sc^TG GaSSJSc TCCC TGGCATATGG CACATCTTGG AGGGMTGCC CAGCGAGGGC 

SEQ 4 CAATACTCAG CC rAC^TG GaSS^Sc TCCC TGGCATATGG CACATCTTGG AGGGATTGCC CAGCGAGGGC 

SEQ 5 CAATACTCAG CC Ii:^^^^^^ gcS^S ISc TACCACCTGG CCCATCTGGG CCACCTCGCC CTCAAAGGCG 

SEQ 7 ATGTACTCCT GCGAGTCGGA CCCGTCGTCT ^CCCACGTCG ^CGCCCTAAC ^C J^^ScATCG CGCACTTGGG AGGTATTGCC CAGCGCGGCC 

SEQ 9 CAGTACTCCG CA— - ^i^S^I^S SSScSc ^C^^- TACCATTTAA TCCATTATGG TTCATTAGTG AATCGTGGGC 

SEQ 11 CAATATTCTG CT ^^^^^ SSaGCCAC TCTG TTTCATTTTG TTCATTATGG ATCATTTGCT GTACGTGGAC 

SEQ 13 ATGTATTCAT CG TCA CCAACTGACA ATCAAGCC^ iCi^ ^ggcaCCTTG TCCACCTGGG CTCCTTCGCC CTCCGCGGTG 

SEQ 15 ACCTACTCTG CC I^SS ScS^^^C S^C CCCGAACTCG CCGCCGTCTA CGCCACCTGG GCCCGCGGCG 

SEQ 17 GAACAAATGG GC llcll^ mScCTGCC SaC CCCGAACTCG CCGCCGTCTA CGCCACCTGG GCCCGCGGCG 

SEQ 18 GAACAAATGG GC ' "iS^S^ ^^SS^ JtCCACTTGG TGCACCTGGG CCAGTTCGCC CTGCACGGCA 

SEQ 20 ACCTACTCAG CC "^^^^^ IcScCTGAC cSc TTCCACTTGG TGCACCTGGG CCAGTTCGCC CTGCACGGCA 

SEQ 21 ACCTACTCAG CC — ~ 5^5^ GGcStTGAC AGAT TTTCACCTAG TACATCTTGG AGCGATGGGA ATGCGTGGGC 

SEQ 23 ACTTATTCCG CT ^^^^ GcScGCGAC cSc TACCACCTCG TCCACCTGGG CCAGTTCGCC CTGCACGGCG 

SEQ 25 CAGTACTCCG CC I-SS^TG GCCA^GC^ CGAC TACCACCTCG TCCACCTGGG CCAGTTCGCC CTGCACGGCG 

SEQ 26 CAGTACTCCG CC f^^Jc gS^ATGCCM TGAT TGGCACTTGA CTCACCTCGG GGGAATAATC CAAAGAGGCC 

Sq II S?ccG cci::::::: :::::::::: :::^?g SgcS^ JS?---;.- tggcacttga ctcacctcgg gggaataatc caaagaggcc 

SEQ 32 'ZZllil gtgtcatgac cccc------ tggcacaaac aacacctggg cagcttcgca gcacgaggtc 

SEQ 34 OAATATTCAG CA- t^GA^G gIStGCTAC TGAT TGGCACTTGA CTCATCTCGG AGGCATTATC CAACGAGGCC 

SEQ 36 CAATACTCCG CC IIlt^GMG gSgSaC TGGCACTTGA CTCATCTCGG AGGCATTATC CAACGAGGCC 

III II S^Sg c^;::::::: :::::::::: I-^gacg ™aSc ISg — tggcatcaca cccacatggg cggcatcatc caacgcggtc 

tl ;=GG CG:::^Gi; ^G^^GCG G^CCGi^ ^CGC^ CCC^^^ '^^^^^ ^-^CG GCGCTGGGGC CAGGGCGAGT 

Ifo II ^^It cg:::::::: :::::::::: iiSS gSS?S ?ccg — taccacattg cgcatttggg atcgtttgcc ctgcacggtg 



721 731 741 



SEO 1 'clll'^ -^'^7.^ «;;^^0G i^ccc| r-'oGccGC SccSc ?ctSc f^lct'-" I-^^lS 

ill 

SEQ 7 caggcctcgt cttcatcgaa gccaccgccg tgcagcccaa c— gggcgc ^ctccccca ^^gactcgg^ tttatggaag GACTCG CA 

SEQ 9 CCGGTCTCAT GATGATCGAG GCAACCTCCG TCTCACCTGA A-~GGCAGA AJCACGCCGC AGGACGTCGG GATGAA-r CA 

SEQ 11 CAGGTATCAC CATTGTTGAA AGCACGGCTG TTTCTCCTGA G— -GGT66A TTATC^CTC ^TGATTTAGG ^A 

SEQ 13 CAGCATTAAT CATTTTAGAG AGTATCTTTG TGTCCGAAAA TCCGGA ^T^CATTC ^TGATTTAGG ^^^^ ^A 

SEQ 15 TCCCCCTCAC CATCTTCGAG GCCACCGGCG TCCTCCCCAA C---GGCCGC A^E^^CCCG CCCCAACCAC CCCGGCACCA CGCCCGAGCA 

SEQ 17 ACTGGGGCCT GATTCTCACC GGCAACGTCC AAGTCGACCA CGCGCACAAG GGCGACGCCC ^^^^^^^ CCCCAACCAC CCCGGCACCA CGCCCGAGCA 

SS i; SSS SSS SSfcS SSSS ™S= S^T^ 

s s ssss; ;553s; ssiss: msss; 'f '-ssi ssss' sss s;s:~ ::=r-s 

i r: = iiS iSSi iiSi EEi SSii^ = = EE? 

II 7o^^^ -C^^C Zi^T^-^ -^^^ CCCCC^COC. aCC^C OCCCAOCCCT 000.00^.00 

II fcSfc?^^ ----- — 

801- • 811 821 831 841 851 861 871 881 -"---~;6 

~ +**+*****-. ******—-- — — ** 

SEO 1 gI^CGCCCCG :::;;g;GCC GGG-TcirCGA CTTCGTGCAC AGCCAGGGC- CA^GATCG GCGTG--- -~CAGCTT GCCCATGCCG GCCGGAAAGC 

SEQ 2 GATCGCCCCG — ATGCGCC GGGTCATCGA CTTCGTGCAC AGCCAGGGC- CAGAAGMCG GCGTG_ ^c^^ATC GCACACGCAG GTCGCAAGGC 

SEQ 4 GATTGAGCCA TTGAGCC GCGTGATCGA GTTTGTCCAC ^GTCAGAAC- CAGCTTATCG GCGTG _^XC GCACACGCAG GTCGCAAGGC 

SEQ 5 GATTGAGCCA ---TTGAGCC GCGTGATCGA GTTTGTCCAC AGTCAGAAC- CAGCTTATCG GCGTG CASA GCCGGAAAGC 

SEO 7 ATTCCTGGGG CTGAAGC GGGTCGTCGA GTTCATGCAC GCACAGGGC- GCCAAGGTCG GGATC r^MfT GCCCACGCCG GCCGCAAGGC 

III I GmSgCGCCC — ATGAAGC GCGTGATCGA CTTCGTGCAC TCGCAGTCC- CAGAAGATTG GCGTG----- ZZZJS^H GGcS^GG^G gSg^GC 

SEQ 11 AGCAGAGAAA — TTGAAAC CAATTGTCGA TTACGCTCAT JCTCAAAAG- ^AATTAATTG CCATC_ _CAA ^^^g GGCGAAAGAT 

SEQ 13 AGCTCACAGT —TTACGGA AAATTGTTGA TTTTATTCAT ^ATCAAGAC- ^^^^^^T GTATA ^ GCCCACGCCG GCCGCAAGGC 

SEQ 15 GATTGCGCCC — CTCAAGC GCATCGTCGA CTACATCCAC TCCCAGGGC- ^AGAAGGCCG ^TATC ^^^g^^^^c AACCACCCTG GTCGCCAGAG 

SEQ 17 GACCGTCACG GCCTTCAAGG CCTGGGCGGA CGCCGCGCGC CTGAATGGC- ^^^^CAAAA ^GCCTGTGGT ^GTG ^^CACCCTG GTCGCCAGAG 

SEQ 18 GACCGTCACG GCCTTCAAGG CCTGGGCGGA CGCCGCGCGC JTGAATGGC ^AGTCCAAAA CGCCTGTGGT ^ GCTCATGCCG GCCGCAAGGC 

SEQ 20 GATCGCTCCT CTGCGCC GCATCGTCGA CTACGTGCAC AGCCAGGGC- CAAAAGATCG ^CATC U qcTCATGCCG GCCGCAAGGC 

SEO 21 GATCGCTCCT CTGCGCC GCATCGTCGA CTACGTGCAC AGCCAGGGC- CAAAAGATCG CCATC pVaTTr GCGCATGCTG GTAGftAAGGC 

III 23 S?^GCCG — TTACGAA GAATTGTTGA ATTTGCTCAT TCGCAAAAT- CAAAAAATTG ^GATT----- —g^JJ^ ^ccS^GCCG S^GGC 

SEQ 25 GATTGCGCCG —CTGAAGC GCATCGTCGA CTTTATCCAC TCGCAGAAC- ^AGGTCGCGG CCATC CAGC GTCGCAAGGC 

SEQ 26 GATTGCGCCG —CTGAAGC GCATCGTCGA CTTTATCCAC J^GCAGAAC- ^AGGTCGCGG CCATC tCGCATGCGG GTCGCAAGGC 

III 3=c? l^.^^c =g^- SS^- -CTG T^GG GTCGCAAGGC 

III II ™Icci IIIcicCGG^ liii™ C^TT^^^^^^ AGC^^^C- GCCAACATTG CXATT--- ----CAGATA GGTCATGC^^ GGAG^aC 

SEO 36 AATCGAGCCC T— TTGAAGC GCATCACTAC TTTTGCCCAC AGCCAAAGCW CAGAAGATTG GTAT »rarrTC TCGCACGCTG GTCGTAAGGC 

III 37 ^TCGAGCCC —TTGAAGC GCATCACTAC TTTTGCCCAC AGCCAAAGC- CAGAAG^TG GTAT- QCgSSgCGG GCCGGAAAGC 

SEQ 39 GATCGAGCCT C-TTGCCAA GGTCGTC-GA GTTTGCCCAC TCCCAGAAC- CAGAAGATCA .TGATT----- °^3TTG eCGCAJrect,^. 

III 43 ItTc^^Tg ;;;;c^gc ^cgccgccgc cgc^gga^ cIcgg^gcI ctcI^cgtc gcgI:::" -ZZZ^tt gSS^gc^g g?Sgc 

i 15 ::S ^gS Tc^^t S^?fG S^gc atgcgggaag 




i. 



r 
SEQ 1
SEQ 2 
SEQ 4 
SEQ 5 
SEQ 7 
SEQ 9 
SEQ 11 
SEQ 13 
SEQ 15
SEQ 17 
SEQ 18 
SEQ 20 
SEQ 21 
SEQ 23 
SEQ 25 
SEQ 26 
SEQ 28 
SEQ 29 
SEQ 32 
SEQ 34 
SEQ 36 
SEQ 37 
SEQ 39 
SEQ 41 
SEQ 43 
SEQ 82 
SEQ 84 



SEQ 1 
SEQ 2 
SEQ 4 
SEQ 5 
SEQ 7 
SEQ 9 
SEQ 11 
SEQ 13 
SEQ 15 
SEQ 17 
SEQ 18 
SEQ 20 
SEQ 21 
SEQ 23 
SEQ 25 
SEQ 26 
SEQ 28 
SEQ 29 
SEQ 32 
SEQ 34 
SEQ 36 
SEQ 37 
SEQ 39 
SEQ 41 
SEQ 43 
SEQ 82 
SEQ 84 



SEQ 1 
SEQ 2 
SEQ 4 
SEQ 5 
SEQ 7 
SEQ 9 
SEQ 11 
SEQ 13 
SEQ 15 
SEQ 17 
SEQ 18 
SEQ 20 
SEQ 21 
SEQ 23 
SEQ 25 
SEQ 26 
SEQ 28 
SEQ 29 
SEQ 32 
SEQ 34 
SEQ 36
SEQ 37 
SEQ 39 
SEQ 41 
SEQ 43 
SEQ 82 
SEQ 84 



CACCACCGTT 
CACCACCGTT 
CAGCACCGTC 
CAGCACCGTC 
GAGTGCCGTT 
TTCGAACATC 
TTCTGGTCaG 
TGTTGAAGGG 
CTCCACCAAG 
TCCGATGGGC 
TCCGAT6GGC 
CAGCACAAAG 
CAGCACAAA6 
TAGCACCACT 
TAGCACCCTG 
TAGCACCCTG 
CAGTTGCGTA 
CAGTT6CGTA 



GCGCCCTGGA 
GCGCCCTGGA 
GCGCCATGGC 
GCGCCATGGC 
GCGCCGTGGC 
GCCCCCTGGC 
CCCTTATTTT 
GTACCATTCC 
GCCCCCTGGC 
GCGGGCACGC 
GCG6GCACGC 
GCCCCCTGGC 
GCCCCCTGGC 
GCTCCTTATC 
GCACCGTGGA 
GCACCGTGGA 
TCTCCCTGGC 
TCTCCCTGGC 



TCTCA— 
TCTCA— 
TCTCG— 
TCTCG— 
TGGCG— 
TCATG— 
TGCAC- 
AACAA- 
ACTAC- 
GGGGA— 
GGGGA- 



-GCGC AGGCGGGCAA 
^AA 



-TTCTCGGCC 
-TTCTCGGCC 
-GCCJ^CGAT 
-GCCAACGAT 
GTCGAGTCTG 
CAAiSGGCATC 
-TTGGAACAA 



ATC6CGACGG 
ATCGC6ACGG 
ACCGCCTCCG 
ACCGCCTCCG 
AAGGCGGACG 
GTCGCGACGG 
GTT6CAGAXA 



CAGCGCGG 

-CTGT GGGAGAAGGC 
-CTGT GGGAGAAGGC 



ACGACTCCTT CACCCCCAGC GGCGAGTATA AGCCGAGAGA 
ACGACTCCTT CACCCCCAGC GGCGAGTATA ^GCCGAGAGA 

GAGGA 

TCACC GAGGCTCG 

TCACC 

TAAGC 

TAAGC 



GAGGCTCG 



CAAGAGCGAG 
GGTGGCGCCC 
GGTGGCGCCC 
GGGCTTACAG 
GGGCTTACAG 

TACACA 

CGGCAAGGCG 
CGGCAAGGCG 
-GTAAATGCT 
-GTAAATGCT 



CTTGCCGGCC 
TCGCCGGTGC 
TCGCCGGTGC 
GTCGTCGGAC 
GTCGTCGGAC 
GTTGCGACTG 
CTGGCTCAGG 
CTGGCTCAGG 
GTCGCGGCGG 
GTCGCGGCGG 



AGAAGGTCGG 
ACAAGGTCGG 
AGAAGATGGG 
AGAAGATGGG 
AGAGCGTTGG 
AiSJUlGGTCGG 
AATCTGTCAA 
— ATACAACA 
CCGAGCAGGG 
CGTTGGTGTT 
CGTTGGTGTT 
CCGAGTATGG 
CCGAGTATGG 
AAGCTCAAGG 
AGAGCGAGAA 
AGAGCGAGAA 
AAGAAGTGGG 
AAGAAGTGGG 



CGGATGGCCG 
CGGATGGCCG 
CGGCTGGCCA 
CGGCTGGCCA 
CGGGTGGCCC 
TGGCTGGCCG 
TGGGTTTGCC 
TGGTTGGCAA 
TGGCTGGCCC 
GGGAGAGGCG 
GGGAGAGGCG 
CGGCTGGCCT 
CGGCTGGCCT 
TGGGTGGGAG 
CGGGTGGCCC 
CGGCTGGCCC 
TGGCTGGCCA 
TGGCTGGCCA 



GAGCACAGTC GTACCGTGGC TGGAC — 
TAGTTGTGTA TCTCCGTGGT TGAGC — 
TAGTTGTGTA TCTCCGTGGT TGAGC — 
GAGCACTGTG GCACCATGGT TAAGC — 



CCGCGGCAGC .GTCCAGCAGC ACCCC 

CTCCTGCGTT GCTCCTTGGT TAG3VC 

GAAGGCCTCG GACTGGTCAC CTTTC 



CGCAAGAAC 

-ATCAACGCT 

^ATCAACGCT 

-GGCGGCGAT 

GACT GCCGAGTAAA 

ATTAGCGC CAGCGACGTG 

GCCGGACTT 

-TACC GCGGAGAAAA GAAGCAAAAG 



1001 

*********** 
GACCCGCGTC 
GAC-CGCGTC 
GGC-CGCGTC 
GGC-CGCGTC 
GCG-GATGTG 
GAT-CGTGTG 
GAC-AAAGCA 
GAA-CATTGT 
GAG-AACGTC 
TTT-GTGCCT 
TTT-GTGCCT 
GAT-GACGTC 
GAT-GACGTC 
AAT-GATGTT 
GAC-GACGTT 
GAC-GACGTT 
GAC-AATATC 
GAC-AATATC 



1011 



1021 
**** ********_ 



1061 



ACTGCTTTTA 
GTTGCCGCTA 
GTTGCCGCTA 
GTTGCTGGTG 
CGCGCCGGCA 
CA6CTTAAGC 
GCCGCTGAAA 
TTTGTGACGC 

1071 



AGGAAGTCGG 
AGGAAGTCGG 
AGGACGTCAA 
AGGAGGCGGG 
AGGAGATG— 
AGGCCGCTGG 
AGGAGGAAGG 

1081 



TGGCTGGCCA 
TGGCTGGCCA 
CGGATGGCCA 
AGGATGGCCG 

TGGATGGCCC 
TGGCTGGCCG 



AAAGGGCCCG GCGATATC 

AAAGGGCCCG GCGATATC 

AAAGGCCCGA CAAATGTG 

AAAGGCCCGA CAAATGTG 

GTGGGTCCGT CGGGCGGG — 

ATCGGCCCGT CCACCGTG 

GTTGCTCCTT CTGCATTG; — 

GTGGGGCCAT CTACTGA& 

TGGGCCCCCA GCGCCATC — - — 
CGCTTGTTGT CGAAAGTG — 

CGCTTGTTGT CGAliAGTG — "*' 

TGGGCCCCGA GCGCCATC 

TGGGCCCCGA GCGCCATC 

TATGGACCAA ATGAAGAC 

GTGGCTCCCA GCGCGATT 

GTGGCTCCCA GCGCGATT — 

GTTGCTCCCT CGGCCATC 

GTTGCTCCCT CGGCCATC 



-GAGGAGC ATATCTTTAG 



^*** 

-CCCTTTGCG 
-CCCTTTGCG 
-CCCTTCACC 
-CCCTTCACC 
TCCCGAGGAG 
-CCCTTCCAC 
-AGACCA2VAT 
-CCATTTAGT 
CTACAACGAG 



**** ********** *****—— 



-CCGTTCTCG 
-CCGTTCTCG 
"-AGGTGGGAC 
-CCTTACACC 
-CCTTACACC 
ACAAGAAAAT 
: ACAAGAAAAT 



GAGCCCTTCG 
GAGCCCTTCG 
GTTAAGAACC 
GTTAAGAACC 
GATGCGTATT 
GAGACTTTCC 
GGTAATTTAC 
GATTCACACA 
GAGACCTTCC 

CTTTTCG 

CTTTTCG 

GAGGACTTTC 
GAGGACTTTC 
GAAAACCACG 
AAGGACTGGG 
AAGGACTGGG 
6GTGTGAACC 
66TGTGAACC 



CCAAGCCCAA 
CCAAGCCCAA 
CTGTGCCGAA 
CTGTGCCGAA 
GGGTGCCGCG 
CCACCCCCAA 
CTGTTCCTAA 
ATACACCACG 
CCTTCCCCAA 
GCACGCCGCG 
GCACGCCGCG 
CGAACCCCAA 
CGAACCCCAA 
CTCAACCTCA 
CCACACCGCG 
CCACACCGCG 
CAGTTCCCAA 
CAGTTCCCAA 



GGCCATGACG 
GGCCATGACG 
GGAGATGACC 
GGAGATG21CC 
GGCGCTGAGC 
GGCCATGACG 
TGAGTTGACC 
AGAATTGACT 
GGAGATGACC 
GGAGCTGACG 
GGAGCTGACG 
GGAGATGACC 
GGAGATGACC 
TAAGTTAACT 
TGAGTTGACT 
TGAGTTGACT 
GGCTTTCACG 
GGCTTTCACG 



ACAAGAAGCT GGCGTGAACC CTGTTCCCAA GGCCTTCACC 

SS? S^S?^:: „_„.| c^e^ c^^^- 

^m^^ ==c?^ =^ ^^i^ 'r^-^:^ CO.™ .O^C.^T ...CCCCC^ ^TXCXCC 

-------- CCTTTTGCT CCTGGCTACC CTACCCCCCG TGCTATTACT 

GAT-GACGTT GTCGGACCTA GCAACGAfi GCATATGCG CAAGGTCACG TTACCCCTCG AGCTCTCACG 

GAT-CGTGTC GTCGCTCCTT CGGCCATC l.uaiaxv=v.v:. 

1101 1111 1121 1131 1141 1151 1161 1171 1181 1191



CTGGATGA-G 
CTGGATGA-G 
AAGCAGGA-T 
AAGCAGGA-T 
ACGGCCGA-G 
AAGGACGA-C 
AAAGATGA-A 
GTTAATGA-A 
GTCGAGCA-G 
GTTGCGGA-G 
GTTGCGGA-G 
GTTGAGGA-G 
GTTGAGGA'G 
GAAAAGCA-A 
ACCGAGGBRG 
ACCGAGGR-G 
AAGGAGGA-T 
AAGGAG6A-T 



AAGGAGGA-T 
AAGGAGGA-T 
TTGGATGA-T 
GTCAGAGA-G 
AAGGAGGA-T 
CTTGAAGA-G 
ACCGAGGA-C 



ATCGAGCAGT 
ATCGAGCAGT 
ATCGAGGATC 
ATCGAGGATC 
GTCCGTCAGG 
ATCGAGCAGT 
ATCAAACGTG 
ATAAATTCAA 
ATCCACGAGC 
ATCAAGGATA 
ATCAAGGATA 
ATTGAGGGAC 
ATTGAGGGAC 
TATGATGAAT 
TCGAGGGTCT 
TCGAGGGTCT 
ATAGAGCAAC 
ATAGAGCAAC 



ATCGAGGAAC 
ATCGAGGAAC 
ATCGAGGCTT 
ATCAAGGAGA 
ATTAAGGCGG 
ATTGAACAGT 
ATCAACAAGT 



TCAAGAAGGA 
TCAA6AAGGA 
TGAAGACCGC 
TGAAGACCGC 
TGGTGGCGGC 
TCAAGCGCGA 
TTGTTAAGGA 
TTGTGGAAGA 
TCGTCGAGGC 
TCGTGCAAAA 
TCGTGCAAAA 
TCGTCACCAG 
TCGTCACCAG 
TAGTGGATAA 
GGGTGAAGAA 
GGGTGAAGAA 
TCAAGAGCGA 
TCAAGAGCGA 



CTGGGTGGCG 
CTGGGTGGCG 
CTGGGTGGCG 
CTGGGTGGCC • 
GTTTGCGAAG 
CTGGTTTGAT 
TTTTGGTGCT 
CTTTGCCAAT 
CTGGAAGGCG 
GTTTGCGGTG 
GTTTGCGGTG 
CTTTGTGGAC 
CTTTGTGGAC 
GTTTGTTGTT 
GTTCGCCGAG 
GTTCGCCGAG 
CTACGTGGAA 
CTACGTGGAA 



GCCACGAAGC 
GCCACGAAGC 
GCTGTC21AAC 
GCTGTCAAAC 
AGCGC6CGGC 
GCGTGCAAGC 
GCTGCTAGAA 
GCAGCTTGGC 
TCTGCCCAGC 
ACGGCGAGGA 
ACGGCGAGGA 
GCTGCCAAGC 
GCTGCCAAGC 
GCTGCGAAGC 
TCGGCCAAGA 
TCGGCCAAGA 
GCGGCAAAAC 
GCGGCAAAAC 



GCGCCATGGC 
GCGCCATGGC 
GGGCTGTTAA 
GGGCTGTTAA 
TAGCGGTGCA 
GGGCCATTGC 
GAGCTGTTGA 
GGGCTGT6GA 
GTGCCCTCAA 
TCACGGCCGA 
TCACGGCCGA 
GTGCCATCGA 
GTGCCATCGA 
GTGCAGTTGA 
GGTCAAATCG 
GGTCAAATCG 
GAGCCATCCA 
GAGCCATCCA 



CG CCGGT 

CG CCGGT 

QQ CCGGA 

GG CCGGA 

GG CTGGG 

CG CTGGC 

AATCAGTGGC 
AATCTCAAAA 

GG CCGGC 

GG CCGGG 

GG CCGGG 

GG CCGGC 

GG CCGGC 

AA TAGGT 

A GCTGGT 

AG CTGGT 

TG CTGGT 

TG CTGGT 



GCGGACTTTG 
GCGGACTTTG 
GCCGACTTTA 
GCCGACTTTA 
GTGGATGTTA 
GCGGACTTCA 
TTTGATGCAG 
TTCGATGCCA 
TTCGACCTCA 
TTCAATGGCG 
TTCAATGGCG 
GTCGACATTA 
GTCGACATTA 
TTTGATGTAA 
TTTGACGTCA 
TTTGACGTCA 
TTCGATGTTA 
TTCGATGTTA 



TCGAGATTCA 
TCGAGATTCA 
TCGAGATCCA 
TCGAGATCCA 
TCGAGATCCA 
TCGAGATCCA 
TTGAGATTCA 
TTGAAATACA 
TTGAGATCCA 
TGGAGATCCA 
TGGAGATCCA 
TTGAGATTCA 
TTGAGATTCA 
TTGAAATTCA 
TTGAGATCCA 
TTGAGATCCA 
TCGAAATTCA 
TCGAAATTCA 



TCAAGAATGA 
TCAAGAATGA 
TCAAGAAGGC 
TGGTCCAAGA 
TGATTGAGGG 
TGAAGGAGGA 
TGCAAGACAA 



CTTTCTGGCT 
CTTTCTGGCT 
6TTTGGAGAG 
CT6GGCGACA 
TTTTGCCCAC 
CTTTGTTTCC 
ATTCGTTCAG 



GCAGCUAAAC 
GCAGCHAAAC 
GCGGTCAAGC 
GCAGCGAAAA 
ACGGCCGAGT 
GGTGTTCGTC 
TCGGCACGAT 



GAGCCAWCCG CGC- 
GAGCCAWCCG CGC- 
GGGCATTGAA GGC- 
GGGCGGTGAA AGC- 
ACCTTGAAAA GGC- 
GAGC66TTGA AG— 
GGGCGTTTGA AG— 



— TGGT 
— TGGT 
— TGGA 
— GGGC 
— CGGT 
-CAGGA 
-CTGGG 



TTTGATGTCA 
TTTGATGTCA 
TTTGATGTTA 
GTGGATGTAA 
TTCGACGGTA 
TTTGACACTA 
TATGACTACG 



TCGAGATCCA 
TCGAGATCCA 
TTGAGATTCA 
TCGAAATCCA 
TCGAATTGCA 
TCGACTTCCA 
TCGAACTTCA 



CAATGCGCAT 
CAATGCGCAT 
CAATGCGCAT 
CAATGCGCAT 
TGGGGCGCAT 
CAATGCCCAC 
TGGTGCTCAT 
TTGTGCTAAT 
CGCCGCCCAC 
TGCGGCGCAT 
TGCGGCGCAT 
CGGCGCTCAC 
CGGCGCTCAC 
TGGCGCTCAT 
CGCCGCTCA- 

CGCCGCT 

TGCAGCTCAT 
TGCAGCTCAT 



TGCAGCTCAT 
TGCAGCTCAT 
CAATGCTCAC 
CGGCGCGCAT 
CGCCGCCCAC 
TTTCGCTCAC 
CAGCGCTCAC 
SEQ 1
SEQ 2 
SEQ 4 
SEQ 5 
SEQ 7 
SEQ 9 
SEQ 11
SEQ 13 
SEQ 15 
SEQ 17 
SEQ 18 
SEQ 20 
SEQ 21
SEQ 23 
SEQ 25 
SEQ 26 
SEQ 28 
SEQ 29 
SEQ 32 
SEQ 34 
SEQ 36 
SEQ 37 
SEQ 39
SEQ 41 
SEQ 43 
SEQ 82
SEQ 84 



GGATACCTGC TGTCGTCATT CCTCTCGCCG GCCGCCAAC "~ 

GGATACCTGC TGTCGTCATT CCTCTCGCCG GCCGCCAAC " ~ ~ ~ 

GGCTATCTTC TGATGTCGTT CCTCTCCCCT GCGGTCAAC I 

GGCTATCTTC TGATGTCGTT CCTCTCCCCT GCGGTCAAC ~ ~~ ~~ ~ 

GGCTATCTCA TCAACGAGTT CCTGAGCCCG GTCACGAAT _ ~ ~~~~~ 

GGGTATCTTC TCTCGTCTTT CCTATCACCG TCTTCCAAC- " 

GGTTATTTGA TTAATGAGTT CTATAGTCCT ATTTCAAAC "~~ ZIIIII 

GGATGTTTAA TACACCAATT TTTAAGTAAA TTGACAAAC 

GGCTACCTCA TTTCCGAGTT CTTGAGCCCC ATCTCCAAC " IIIII IIII 

GGMACCTGT TGGCGCAGTT CTTGAGCAAG AAGACAAAC- " 

S?S^C ^^^l ii;;™ -.Z^Torl-i ^^^^^ l-^Zl i^^CTGTG^ CTTC^J^A 

GGTTACCTSA TCACCGAGTT CCTTTCGCCG CTATCAAAC " ^ 

GGTTATCTTA TATCGTCAAC AGTTAGTCCT GCCACTAAT ~ J^Z 



GGATATCTAC TGCATCAATT CTTGAGTCCG GTAAGCAAT 

GGATATCTAC TGCATCAATT CTTGAGTCCG GTAAGCAAT 



GGATACKTGC 
GGATACKTGC 
GGATACCTGC 
GGGTACCTCA 
GGTTACCTGC 
GGTTATCTTG 
GGATACCTGA 

1301 



TTCACCAGTT 
TTCACCAGTT 
TCCACGAATT 
TCCACGAATT 
TGGCCCAATT 
TTTCCAGCTT 
TGCACTCGTT 

1311 



CTTGAGTCCA GTCAGTAAC- 
CTTGAGTCCA GTCAGTAAC- 
CATCTGCCTG AGAGCAACA- 
CCTCTCACCC ATTACCAAC- 
CCTGTCCGAA ACAACCAAC- 
CCTGTCCCCT GCCACCAAC- 
CCTCAGCCCG TTGACCAAT- 



1321 



1331 



SEQ 1 
SEQ 2 
SEQ 4 
SEQ 5 
SEQ 7 
SEQ 9 
SEQ 11 
SEQ 13 
SEQ 15 
SEQ 17 
SEQ 18 
SEQ 20 
SEQ 21 
SEQ 23 
SEQ 25 
SEQ 26 
SEQ 28 
SEQ 29 
SEQ 32 
SEQ 34 
SEQ 36 
SEQ 37 
SEQ 39 
SEQ 41 
SEQ 43 
SEQ 82 
SEQ 84 



SEQ 1 
SEQ 2 
SEQ 4 
SEQ 5 
SEQ 7 
SEQ 9 
SEQ 11 
SEQ 13 
SEQ 15 
SEQ 17 
SEQ 18 
SEQ 20 
SEQ 21 
SEQ 23 
SEQ 25 
SEQ 26 
SEQ 28 
SEQ 29 
SEQ 32 
SEQ 34 
SEQ 36 
SEQ 37 
SEQ 39 
SEQ 41 
SEQ 43 
SEQ 82 
SEQ 84



— AACCGCAC GGACCAGTAC 
— ^AACCGCAC GGACCAGTAC 
— ^ACGAGAAC AGACGAGTAC 
— ACGAGAAC AGACGAGTAC 
— ^AAGCGGAC GGATGCGTAC 
—ly^GCGCAC CGACGAGTAC 
— AASAGAAC AGATGAATAC 
—AAGAGAGC TGACCAATAC 
— CAGCGTAC GGACCAGTAC 
— ^AfiGCGCGG GGATGAGTAT 
-AGGCGCGG GGATGA6TAT 



CATTTTATTT CCTGGCACGC AGAAACGGAC AGACAAGTAC 

AAACGGAC AGACAAGTAC 

— GACCGCAA TGACAAGTAT 



GGCGGGTCGT 
GGCGGGTCGT 
GGAGGCAGTT 
GGAGGCAGTT 
GGCGGGAGCT 
GGCGGCTCCT 
GGTGGCAGTT 
GGGGGCTCAT 
GGTGGCTCCT 
GGCGGGTCGG 
GGCGGGTCGG 
GGCGGCAGCT 
GGCGGGAGCT 
GGTGGGACAT 



TCGAGAACCG 
TCGAGAACCG 
TTGAGAATCG 
TTGAGAATCG 
TTGAGAACCG 
TTGAGAACCG 
TTGAAAATAG 
TTGAAAACAG 
TCGAGAACCG 
CTGAGAACAG 
CTGAG2VACAG 
TTGAGAACCG 
•STGPiGMKCCG 
TTGAGAAACG 



CATCCGGCTG 
CATCCGGCTG 
CATCCGGCTC 
CATCCGGCTG 
GACGCGGATC 
CATCCGGCTC 
AACCAGATTT 
AGTTAGATTT 
CAGCCGCGTT 
GGGGAGGATT 
GGCGAGGATT 
CACCCGGGTC 
CACCCGGGTC 
TATTTTGTTT 



TCTCTCGAGA 
TCTCTCGAGA 
AGTCTGGAGA 
AGTCTGGAGA 
GTGCGCGAGG 
TCTCTCGAAA 
TTAAAGGAAG 
CTTTTACAAA 
CTCCGCGAGA 
GTTGGGGAGA 
GTTGGGGAGA 
CTGATCGATA 
CTGATCGATA 
CCTATGGAAG 



TTGC6CAGTT 
TT6CGCAGTT 
TCGCCAAGCT 
TCGCCAAGCT 
TTGCGGCGGC 
TCGCGCAGGT 
TTATCGATAG 
TAATTGAGAA 
TCATCTGGGC 
TTATTAAGGA 
TTATTAAGGA 
TTATCAAGGC 
TTATCAAGGC 
TTGTCCATTC 



GACTCGGGAC 
GACTCGGGAC 
CACCCGCGAA 
CACCCGCGAA 
TATTCGTGCG 
CACCCGTGAC 
TGTTAAATCA 
TATAAAACGA 
CGTCCGCTCC 
GTGCAGGAGG 
GTGGAGGAGG 
CGTCCGGGCA 
CGTCGGGGCA 
TGTTCGTAAA 



1411 



-CAAAGAAC CGACGAGTAT GG 

s^^M Gciic^i i^^^^i EEE^ EEEEz hi'^'^. 

^^^^^^^^^^ TATCAGAGTC GTCTTGGAfSA TCATTG 

TCGAGAACCG TATCAGAGTC GTCTTGGAGA TCATTG 

GGGAAAACCG CACTCGTCTG ACAATGGAAA GTCGTCGACC TTGTCCGCAG 
TCGAAAAGCG TACCCGTCTA CTCATTGAAA TCGTAACAGC CGTCCGAGCC 
TCGAAAACCG CATGCGGCTA ATCCTCGAGG TCACGGCCGA GGTCCGCAGG 
TCGAGAACAG AGTGCGCCTT GCTCTCGAGA TTGTCGAGGC TGCACGAGCT 
TCGAGAACCG CGCTCGATTT CTGCTCAACG TTGCCCGTCG AATCCGCCAA 

1451 1461 



— CAAAGAAC 
— CAAAGAAC 
— CCAGGACC 
— CGCCGGAC 
— CAGCGCAC 
— ^AAGCGTAC 
— CAGCGTAC 



1421 



CGATGAGTAT 
GGATGAGTAT 
GACAAGTACG 
AGATTCTTAC 
CGACGAGTAC 
CGACAAGTAC 
CGACGAGTAC 

1431 



GGTGGCAGGT 
GGTGGGAGCT 
GGCGGAAGCT 
GGCGGTTCTT 
GGCGGCAGCC 
GGAGGTAGCT 
GGCGGTAGCC 

1441 



GCCGTCGGCC 
GCCGTCGGCC 
AATGTGCCCA 
AATGTGCCCA 
GTGATTCCCG 
GCCGTCGGCC 
AGTATTCCAA 
AAGATAGAAA 
GTCATCCGCG 
CAGGTGACTG 
CAGGTGACTG 
GTGATTCCCG 
GTGATTCCCG 
GCAATTCCAG 



CTCATGTGCC 
CTCATGTGCC 
AGGATATGCC 
AGGATATGCC 
AGGGGATGCC 
CCAACGTTCC 
AGGATGTTCC 

CA CC 

AGGACATGCC 
AGGCGGTGGG 
AGGCGGTGGG 
AGGAGATGCC 
AG6AGATGCC 
AXAGTATGCC 



C 

T 

T 



GTTTT CCTGCGCATT 

GTTTT CCTGCGCATT 

GTCTT CCTGCGGGTC 

GTCTT CCTGCGGGTC 

CTGTT TCTGCGTATC 

GTTTT TCTCCGTGTC 

GTGTT TTTGAGAATC 

^ATTTT CTTAAAGTTT 

CTCTT CGTCCGTGTC 



T 

A-- 

G , 

C 

TGAAGAGGAG GCGAAGAAGT TTGTGGTGGG AATCAAGCTG 
TGAAGAGGAG GCGAAGAa.GT TTGTGGTGGG AATCAAGCTG 

^ CTCTT CGTCCGAATC 

CTCTT CGTCCGAATC 

Q „ TTGTT TTATAGAGTA 



TCGGCCTCGG 
TCGGCGTCGG 
TCCGCCACCG 
TCCGCCACCG 
AGCGCCACGG 
TCCGCGACGG 
TCTGCTGCTG 
CCAATGTCAG 
TCCGCCACCG 
AACAGT6CGG 
AACAGTGCGG 
TCCGCCACCG 
TCCGCGACCG 
ACGGCTACAG 



1471 

•*■*** + ***+* 
ACTGGTGCGA 
ACTGGTGCGA 
ATTGGCTGGA 
ATTGGCTGGA 
AGTGGTTGGA 
ACTGGATCGA 
AftAATAGTCC 
ATAATTGTAG 
AGTGGATGGA 
ATTGGCAGGC 
ATTGGCAGGC 
AATG6ATGGA 
AATGGATGGA 
ATTGGTTGCC 



1481 



1491 



6GAGACCCTG CCGGA 

GGAGACCCTG GCGGA 

GGAGGTGCAG CCGAA 

GGAGGTGCAG CCGAA 

GGGTCAGCGG GTGGC 

GGAGACCCTG CCGGA — 

TGATCCA 

TGATCCG 

GTACACC 

GGGACGCGAT GGA ^A 

GGGACGCGAT GGAA2M3 

GTACGCCGGC 

GTACGCCGGC 

CAAAGGACAA 



Gc^iccccG 11^;^^ zzziz'^iii cciTCGTGTc agtgcaactg attggttcga gtttgactct caattcaaag 

CATT ZZZZZ Z mill IIIIIcTCTT CCTCCGCCTC TCCTCTACAG AATGGATGGA AGATACCGAC ATCGGC 

GCGATGCCCT CCAGCATGCC T CTCTT ^^^^^ AACAGCGTCG AGTTCCAGGA GAAG 

CGGACGAGCA AGAATTTCAT C IIIHSgS ScTCGCATC AGTGGAACTG ACTGGCTGGA GAACAACCCT GAG 

Sccc?^ Sgt-- :::::::::: ::::_Sc?g SSSc agctccaccg actgggccga ccaagcgcac caa- — 
SEQ 1
SEQ 2
SEQ 4 
SEQ 5 
SEQ 7 
SEQ 9 
SEQ 11 
SEQ 13 
SEQ 15 
SEQ 17 
SEQ 18
SEQ 20 
SEQ 21 
SEQ 23
SEQ 25 
SEQ 26 
SEQ 28 
SEQ 29 
SEQ 32 
SEQ 34 
SEQ 36 
SEQ 37 
SEQ 39 
SEQ 41 
SEQ 43 
SEQ 82
SEQ 84 



SEQ 1 
SEQ 2
SEQ 4 
SEQ 5 
SEQ 7 
SEQ 9 
SEQ 11 
SEQ 13 
SEQ 15 
SEQ 17 
SEQ 18 
SEQ 20 
SEQ 21 
SEQ 23 
SEQ 25 
SEQ 26 
SEQ 28 
SEQ 29 
SEQ 32 
SEQ 34 
SEQ 36 
SEQ 37 
SEQ 39 
SEQ 41 
SEQ 43 
SEQ 82 
SEQ 84 



SEQ 1 
SEQ 2 
SEQ 4 
SEQ 5 
SEQ 7 
SEQ 9 
SEQ 11 
SEQ 13 
SEQ 15 
SEQ 17
SEQ 18 
SEQ 20 
SEQ 21 
SEQ 23
SEQ 25 
SEQ 26 
SEQ 28 
SEQ 29 
SEQ 32 
SEQ 34 
SEQ 36 
SEQ 37 
SEQ 39 
SEQ 41 
SEQ 43 
SEQ 82 
SEQ 84



GCAGAGCTGG 

GCAGAGCTGG 

CAA GCCCAGCTGG 

CAA GCCCAGCTGG 

-CGCGGAGTC GGGCAGCTGG 

GGAATCGTGG 

. GAAGCTTGG 

G2U«3CGTGG 

GGCCA GCCCTCGTGG 

A6GAGGAGGA GGAGACGGAT 
— GA6GAGGA GGAGACGGAT 

GA GCCTAGCTGG 

GA GCCTAGCTGG 

GGATGG 



AAGTCGGAGG 
AAGTCGGAGG 
CGAGGCGTGG 
CGAGGCGTGG 

GATAT GC 

AAGCTCTCTG 
ACTATTGAAG 
TCTACGGAAG 
GACCTCCAGC 
ACGGCGGAGG 
ACGGCGGAGG 
GACCTCGAGC 
GACCTCGAGC 
GAGATAGAAG 



ATACCGTGCG 
ATACCGTGCG 
ACACTGTCCG 
ACACTGTCCG 
AGAGCTCGCT 
ACTCCGTCCG 
ATTCCAAAA- 
ATGCATTGA- 
AGACCATTG- 
AGGTGTTGA- 
AGGTGTTGA- 
AGAGCACAC- 
AGAGCACAC- 
ATACAGTTG- 



GTTCGCGCAG 
GTTCGCGCAG 
ATTTGCGAAG 
ATTTGCGAAG 
GGAGCTGGTC 
CTTCGCCGAA 
— AATTAGCT 
— AGTTGGCC 
— AGCTCGCC 
— AGCAGATT 
— AGCAGATT 
— AGCTTGCC 
— AGCTTGCC 
— CATTAGCA 



GAGCTGGTCA 
GAGCTGGTCA 
ATCCTGGCAG 
ATCCTGGCAG 
AAGAAGCTGC 
GCCCTCGCTG 
GACATTTTAG 
GATCTTGTTA 
AAGATCCTCC 
GAGCTTTTTG 
GAGCTTTTTG 
AAGCTCCTCC 
AAGCTCCTCC 
GCGAGGCTTC 



AGCAGGGCGC 
AGCAGGGCGC 
AAACGGGTTA 
AAACGGGTTA 
CCGAATGGGG 
CCCAGGGCGC 
TAGAAAAGGG 
TTGATTTAGG 
CCGACCTCGG 
AGCAGTGGGG 
AGCAGTGGGG 
CGGACCTGGG 
CGGACCTGGG 
GCGATGGTGG 



CGTTGATCTG 
CGTTGflffCTG 
CGTTGACGTG 
CGTTGACGTG 
CATTGACCTG 
TATTGACCTG 
TATTGCTTTG 
AGTAAAGGTG 
CGTCGACCTC 
GATCGACTTT 
GATCGACTTT 
TGTCGACCTG 
TGTCGACCTG 
TGTTGACTTG 



1581 

********** 
ATCG2\TA!ICA 
ATCG2U?ATCA 
CTTGACGTGA 
CTTGACGTGA 
GTGGATGTCA 
ATCGACGTCT 
GTTGAXGTTT 
ATCGACGTTA 
CTCGACGTCT 
GTCGAGGTTA 
GTCGAGGTTA 
CTCGACGTCA 
CTCGACGTCA 
ATAGATGTTA 



1591 

GCAGCGGTGG 
GCAGCGGTGG 
GCAGTGGC6G 
GCA6TGGCGG 
GCTCCGCCGC 
CTTCCGGCGG 
CATCTGGTGG 
CATCAGGTGG 
CTTCCGGCGG 
GCGGTGGCAG 
GCGGTGGCAG 
GCTCGGGCGG 
GCTCGGGCGG 
GCTCTGGTGG 



Ic'^Itt'tIc ii^ciS I^I^icG^C ;^ra---G Tf*^^^f ^fEE!!^ fEEEE STGG^CA GCTCAGGCGG 



AZWsSU^GTT CGGAAGCTGG GATGTCGAAA GCACGATCA- 

GGTTTCAAG CCA GAGG AGGCGGTGC- 

— TACGAGGG AGAGACCTGG ACTCTTGAGC AGAGCATCA- 
GC CGACTCTTGG ACCGTTGACC. AG2VCGGTTG- 



—AGATCTCC AAAATCCTGG CCGACTTGGG CGTTGATCTG CTCGACGTGT CTTCCGGTGG 
—AGTTGTGC GAGGCCCTCG AGGCCGCGGG CATGGATTTT GTCGAGACGA GCGGCGGCAC 
— AGCTTGCA CACCAGTTAG CAGACCGTGG TGTCGATGTT TTGGATGTTT CCAGTGGTGG 
— AACTCGCC AAGATGCTCC AAGAGGCTCG AGTCGACCTG CTAGAC6TCA GCTCGGGCGG 



1601 



1611



1621 



1631 



1641 



1651 



1661 



1671 



1681 



1691 



########## " z III IIIIIIII 

TGTTCTCGCG CAG " _~" ""^ " 

TGTTCTCGCG CAG " _ ^ ~ ~ ~~ 

CACTCATTCG GAG _ "1" H" 

CACTCATTCG GAG 

GAACCACAAG GAC 

TGTCCACGCC GCG ~ ~~ ZZZZ IIIIIIII 

TAACGATTAT AGA ~ IIIZI - 

AAATGTTGCG CAT " _ 

^^G^° '^TckZY^ Graii^ici '^^^^^ Ji'G^^ ^EEEE EEEE^ tf!f!!f^ f 

TTATGAGGAT CCTCAG " IIIIIIZIII 

AAACTCGGTG GCG IIIIIIIII IIIIIIIIII III 



TATCCATCCT AAG 



GAATCATCCT CAG 

CTATGAGAGT TTT- 



CATCCACAAG ATG- 
CCTGGTTCCA TTC- 



1721 



1731 



1701 1711 

IIIIIIii#i nmmm ########## 

CAG AAGATCAAGT 

CAG AAGATCAAGT 

CAG CATATCCACG 

CAG CATATCCACG 

CAG AAGATCAACC 

CAG AAGATCAAGT 

C AACCACCAAG ATCTGGGATC AGTAAAGAGT 

T GCAAATCTAG ATATCTATTA AATGACGACA 

CAG AAGATCAACG 

ATACAGATGG CCAACGGTCC CAAGCCCGAA AAGTCCGAAC 

^ATGG CCAACGGTCC CAAGCCCGAA AAGTCCGAAC 

i CAA AAGATCGAGC 

CAA AAGATCGAGC 

CAA AGAATTGAGG 



1741 

CCGGCCCTGC 
CCGGCCCTGC 
CGAAGCCAG6 
C6AAGCCAG6 
TGCACACGGC 
CCGGGCCGGC 
TGAGAGAGCC 
AACAACTACC 
TCCACACCTA 
GCACCATGGC 
GCACCATGGC 
TCACGCCGTA 
TCACGCCGTA 
TGAAGGATTG 



CTTCCAGGTG 
CTTCCAGGTG 
CTTCCAGGCA 
CTTCCAGGCA 
CTACCAGACG 
TTTCCAGGCT 
AATCCATGTT 
TTCTCA2U3TG 
CTACCAGATC 
CCGCGAGGCC 
CCGCGAGGCC 
CTACCAGATC 
CTACCAGATC 
CTATCAAGTT 



CCTTTTGCCG 
CCTTTTGCCG 
CCCTTTGCTA 
CCCTTTGCTA 
GACCTGGCCG 
CCCTTCGCTG 
CGGTTGTCTC 
CCCTTGGCTC 
GACATGGCCG 
TTCTTCCTCG 
TTCTTCCTCG 
GACCTGGCAG 
GACCTGGCAG 
CCTTTTGCGG 



TGGCCGTGAA 
TGGCCGTGAA 
TTGCCGTCAA 
TTGCCGTCAA 
GGCAGATTCG 
TGGCTATCAA 
6TGCAATTAA 
GTAAATTGAA 
AGCAGATCCG 
AGTTC6CCAA 
AGTTCGCCAA 
CCAAGATCCG 
CCAAGATCCG 
AAAAGATTAA 



GAAGGCCGTC 
GAAGGCCGTC 
GAACGCCGTC 
GAAGGCCGTC 
CCAGGCCATC 
GAAGGCCGTT 
ACAACATGTT 
AAGCCACATT 
CGCGGCCGTG 
GATCATCCGC 
GATCATCCGC 
CGAGGCCGTC 
CGAGGCCGTC 
GGATCAAGTG 



GGCGAC 

GGCGAC 

GGGGAC 

GGGGAC 

CGAGCG 

GGCGAT 

GGTGAC 

AGA2UVC 

CACGAGGCCG 

ACCAAG T 

ACCAAG T 

GGCGAT 

GGCGAT 

AATGGA 



Itccgccatc gccatcaagt ccggtcctgc ttaccaggta gacctcgcca aacaggtaaa gaaggctgtt ggcgat---- 



G Sii^GCG^ ccgcaagSg S^SSSS^ mcgS^ ctattttatc gagttcgccg aggtcatccg caaggccgtc aagcac 

-G gttttgcgca ccg^^2!Sa aaggtcStg ctggtcccgg ttaccaggca cctcttgcca aggcgatcaa gaagtcagtt ggagac 

IIIIIIII IIIIII-S^ ^?S?G Sggmccgg ataccagcta ttcggagcaa aagccgttcg cgatgctctg gccaaa— - 
1801 1811 1821 1831 1841 1851 1861 1871 1881 1891

IIIIIIIIII MiCG GCAAGCAGGC 

SEQ 1 ^AAGCT GCTGGTTGCC GCCGTGGGTG CCATCACC t _ _ _ AACG GCAMCAGGC 

SEQ 2 ^AAGCT GCTGGTTGCC GCCGTGGGTG CCATCACC _ CGCATTTGGC 

SEQ 4 ^AAACT CGCAGTGGCA TCA5TGGGTA TGATTGCC I AGCG CGCATTTGGC 

To ? Srci ^;^ci^ ici^^^-o c'c^cci^cT ^.XC...3. .C.=.C^.. C.^T^^ 

SEQ 9 ^AAGCT CCTTGTTGCG ACGGTGGGCA CGATCACG 7 ' ^ AAGATCCTGA 

SEQ 11 AAGTT ATTGGTCAGT TGCGTTGGTG GGCTTGAA ^ GAGACATATT 

III II ^^iicSS?? ^??cS ^^ci: :::::=;c; -300^- oc.^^^c 

SEQ 17 TCCCCAAGCT TCCTCTCATG GTCACCGGCG GCTTCCGC ^^^^ GTCAGGGCAT 

SEQ 18 TCCCCAAGCT TCCTCTCATG GTCACCGGCG GCTTCCGC " ^^gg CTGACATTGC 

SEQ 20 AGGTT GCTCATAGGC GCGGTCGGCA ACATCAAC _ j^^^q CTGACATTGC 

SEQ 21 AGGTT GCTCATAGGC GCGGTCGGCA ACATCAAC GTCTTACGGC 

SEQ 23 AT ACTACTTGGC GCTGTCGGAA TGATCAGG __2'l"ZZZ 

SEQ 29 ■■ „ „ CTGGACATCT 

SEQ 32 ^AGTGT ACTTGTTTCA GCAGTAGGTG GAATCAAG " 

SEQ 41 I'lr""!" IIIIIIIIII III GTGGGCG CCATGGTCGA 

SEQ 43 ^ATGGT GGTCTACACC ACCGGCGGCT TCAAGACG _ _ GTACCCTTGC 

SEQ 82 ^AAGAT GTTGATCAGC ACTGTTGGTA GCATCAAG CCGTGG GAATGATGGA 

SEQ 84 — ATCGAACC CGACGCGTCC AAACGCATGC TCGTCGGGG 



1901 1911 1921 1931 1941 



1951 1961 1971 1981 1991 



SEQ 1 GAATCAG ATTCTAG AGGAGCAG — IIIII II 

SEQ 2 GAATCAG ATTCTAG AGGAGCAG III 

,SEQ 4 ' CAATTCC TTGTTGG AGAAGGAC — " " : 

SEQ 5 CAATTCC TTGTTGG AGAAGGAC — ~~ ~~" 

SEQ 7 CGAGGCAATG CTGTCGGGAC CTGAACCC— ^ ~" 

SEQ 9 GAACAAG CTGCTTG AGGAGGAG— _ 

SEQ 11 ATTGCTCAAC AAATATTTAG AAGAAGGA I_IIIII III • 

SEQ 13 TAAACTCGAT GAGTTTATTG CTAATGGT ~ 

SEQ 15 CATCCAGCGC GAGAACGGCG CCAAGACT """" 

SEQ 17 GGAGGCC GCTTTGG AATCCGAT " "11 

SEQ 23 GAATGAAATC CTAGAAAGTG GAAAAGCT IIIIIIIIII 

SEQ 32 TGCTGAA GAGGTTT TGCAATCT ~~~ IIIIIIIIII IIIIIII 

-SEQ 43 CGCGCTGCAG GGCGTCGATG GG 



SEQ 82 GGAGGAG ATCATCG CTGGAGGAGA GGACGATACC 

SEQ 84 AGGTTCC tTACGATT CGCCCAAC 

2001 2011 2021 2031 2041 . 2051 



ct?n 1 TATATCGACG TTGCGCTGGT TGGCCGTGGG TTCCAGAAGG ATCCCGGTCT GGCCTGGACG TTTGCTCAGC acctcggcgt C- 

slS 2 GA?a?CGACG ??GCgSgGT TGGCCGTGGG TTCCAGAAGG ATCCCGGTCT GGCCTGGACG TTTGCTCAGC ACCTCGGCGT C- 

slS 4 ggIcJggacc ??gtgctggt tggacgtggc ttccagaaga acccggggct ggtgtgggcg tgggccgacg agctgaatgt A- 
lln s g^Sggacc ttgtgctggt tggacgtggc ttccagaaga acccggggct ggtgtgggcg tgggccgacg agctgaatgt a- 
S2 n SSgcIgmg c^ttctgat agcccgtcag ttcctgcgcg agccagaatg ggtgttttcc acggcgagaa agttgggcgt g- 

SS 9 ^?5gGMG 5ScGC?TG? GGGACG^St TTCCAGAAGG ATCCCGGTCT GGCGTGGACT TTCGCGCAGC ATCTTGATGT T- 

11 1S?t?Stc ttgctttgat cggtagagga tttttaagaa atccaggttt ggtatgggag tttgccgata aacttggtgt t- 




SEQ 23 GMG iiiESiiS iiSi^GGRG TTCTTAAGGA ACCCGTCGIT GGTGCT«3ftC AGCGCGSACC AGTTGGGTGA 


sIq II GG^MC^^I i^TCAGGGC ii^OTTCG iic^^l^MA M^TCOTCT GGITCGMCI TTTGCTAACG AGCTTGGCGT G-------- 

2131 2141



2181 



Ilel^ TCTCCATGGC CAACCAGATC CGCTGGGGCT TCACCCGGCG TGGAGGCACC CCGTACATTG ATCCTTCGGT 

SEQ 1 --GAAA JCTCCA^GGC ^^CAGATC ^ TCACCCGGCG TGGAGGCACC CCGTACATTG ATCCTTCGGT 

SEQ 2 II-SS ?CTcS?GGC TAAtSaStC CGATGGGGTT TCTCGCGGCG CGGTGCTGGT CCTTACCTCR GGAAGAAflCT 

EQ 4 :::: ?c?cS?gGC ?^CA^?c CGATGGGGTT TCTCGCGGCG CGGTGCTGGT CCTTRCCTCA GGAAGAAACT 

ZZZ Z, CCGG TGACTGTCCC GGTGCAGTTT GGCAGGGCCA TTTAG " 11™***^,- 

11^ I : Sga ttgcgItggc gagtcagatt cggtggggat tcacaaggcg cgggggcacg ccttatatcg accccaaagc 

lln ?i tccac^Sgc cttgcagtta ggttggggtt tctggcccaa caaacaacaa attgttgatt tgattgaaag 

13 m — cAAT tcagaacagc acctcaatat aagttggcct tatcataa 7:::i"iimn 

GATG tcaaggcccc tgttcagtac ctccgtggtc ctcttagcag caggcccaag aagttgacca ctgttcctta 

SEQ IS :::„SgG IS^gSgC CCGci?GTTC GAduWSAAGA GGGCTGAGCC GCACTGGATC GTTGAGAAGT TGGGCATGAA 

Hi :::::::::: :::::.cCGG A?GCGGA?GC CCgStGTTC GACAAGAAGA GGGCTGAGCC GCACTGGATC GTTGAGAAGT TGGGCATGAA 

m on I "I - - ^AATG TGCAGTGGCC TCACCAATAC CACAGAGCAG TGTGGCGCAA GGGTGCAAGG ATTTGA 

foS - ^AATG TGCAGTGGCC TCACCAATAC CACAGAGCAG TGTGGCGCAA GGGTGCAAGG ATTTGA 

SEQ 23 ~ - AATG TTGCATGGCC AGTTCAGTAT GACTATGCAG TTAAGGGACA CAGAAAGTTA CGTTGA 



III 32 Illliri" IIIIIIIIII -.II-II^GG i^^GA^GG^ ^C^IgItT GATTGGAGCT TCAAGGGACG TGGAAAGAAA GTGAACAAGA fTTCTTTAIA 



III tl icrG^-G^ C^i^^G CTGCAGTTGA CTGCCTGCTC GGCGCAAATA AGGCTGATGG CCAAGGGC^ GGAGCCGTTT GAC---^--- 

III II 7g7c17^11 '^Tg^II ^ggatgcS g5gS5 S^gggtacc agggcagctg gcaacccgca gtaccatcgc gttcacgtgg ctaagaagtg 



SEQ 1 GTACAAGCAG TCTATTTTCG ATGTATAG

SEQ 2 GTACAAGCAG TCTATTTTCG ATGTATAG

SEQ 4 CGAGAAGATA TAA

SEQ 5 CGAGAAGATA TAA

SEQ 9 TTATAAGGAG AGCATCTTTG AGTJy^ : ^I IIII IIII 

SEQ 11 AACATCTAAA TTAGAAGTAA ATTAG IHIIIII IIIIIIIIII II— I 

ii li GTl'^ili'T ZTo^T^t'g tT^Tgg^ i^si^ii c^ccS; 7t^t'^ ^EE^ f!!™™ !f!!!!f!^ 

SEQ 18 GTCCATTGTT GGTGCTGGTG TTGAGGTG IIIIIIIIII III 

SEQ 37 
SEQ 39 



SEQ 41 IIIIIIIIII IIIIIIIIII III ATCTC AAACGCC6AC GAGGTGGCGC GGGTGACGCA GTTGATGGCG 

SEQ 84 A " 



SEQ 1 IItZIHH i^Si^I TgItIIItcI T^ll'^CK MGGACCCTT ^C^TATTATT TCTCGTCTCC TGCGTATGTT ^CAAG^TATTC ACAGTAGCTG 

isQ 2 AGTATAGATA GAGTTGAAGA TGATACCTCA TAGACGATCA ATGGACCCTT fCATATTATT T--------- --"jll 

SEQ 17 ACGTGGTATG TGAGCGAGCT CAAGAAGCTG GCCAAGTTTT AG

SEQ 18 ACGTGGTATG TGAGCGAGCT CAAGAAGCTG GCCAAGTTTT AG

SEQ 43 GAGGGCAAGG TG
CGTCcicTTA iCTTicTCCG TCMTCCTTC Tfliici^ic C^icGC^C GCATGGCGAC CACGGATCGA GTCGAMTTC TCCGTCGTTC GTMCTGATC 



AATATAAARA 



GCGGGGAATG GCTTGACCCC GCGCAGAATG TCGATCTCTT CGCAAACTCT CGGTGTATAG GAGGCTCAfiC AACGATCAAG G 



Figure 2. A multiple alignment of the 2031 OR nucleic acid
sequence from A. fumigatus (SEQ 1, 2) along with related 2031 ORs
from other fungi and bacteria (see also Example 4). Regions 1-11,
marked with * or #, refer to regions conserved at the amino acid
level between ORs but not OYEs.

Fungal 2031 ORs are given by: SEQ ID Nos. 1, 2, 4, 5 and 7,
A. fumigatus; SEQ ID No. 9, A. nidulans; SEQ ID Nos. 11 and 13,
C. albicans; SEQ ID Nos. 15, 17 and 18, N. crassa; SEQ ID
Nos. 20, 21 and 43, M. grisea; SEQ ID No. 23 (NP_595868),
S. pombe; SEQ ID Nos. 25 and 26, C. trifolii; SEQ ID Nos. 28, 29,
31, 32 and 34, F. sporotrichioides; SEQ ID Nos. 36, 37 and 82,
F. graminearum; SEQ ID Nos. 39 and 41, M. graminicola; SEQ ID
No. 84, U. maydis.
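
As a rough illustration of how conservation between two rows of an alignment such as Figure 2 can be quantified, the following Python sketch computes percent identity over the aligned columns in which neither row has a gap. The function name `percent_identity` and the use of "-" as the gap character are assumptions made for illustration only and are not taken from the specification. The two input strings are the last SEQ 1 and SEQ 2 blocks shown in the alignment.

```python
def percent_identity(row_a: str, row_b: str, gap_chars: str = "-") -> float:
    """Percent identity over aligned columns where neither row has a gap."""
    # Remove the spaces that separate blocks of ten and ignore letter case.
    a = row_a.replace(" ", "").upper()
    b = row_b.replace(" ", "").upper()
    if len(a) != len(b):
        raise ValueError("aligned rows must have equal length")
    compared = 0
    matches = 0
    for x, y in zip(a, b):
        if x in gap_chars or y in gap_chars:
            continue  # skip columns that contain a gap in either row
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


# Last aligned block of SEQ 1 and SEQ 2 (A. fumigatus) as printed in the figure.
seq1_block = "gtacaagcag tctattttcg atgtatag"
seq2_block = "gtacaagcag TCTATTTTCG ATGTATAG"
identity = percent_identity(seq1_block, seq2_block)
print(f"SEQ 1 vs SEQ 2, final block: {identity:.1f}% identity")
```

Skipping gapped columns is only one convention; counting gaps as mismatches would give lower values for rows of unequal length.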
Gel images (A) and (B). Lanes in (A): 0 h, 0.5 h, 1 h, 2 h, 4 h and 24 h, without (-) or with (+) IPTG.

Figure 3. Recombinant 2031 OR. (A) Time course of recombinant 2031 OR induction
over 24 hours after the addition of IPTG (samples without IPTG are also shown). The gel
was stained with Coomassie. A prominent band of the correct molecular weight (marked
with an arrow) is seen. (B) Coomassie-stained gel showing purified recombinant 2031 OR.
Leaf labels: 4961, A. fumigatus; SEQ ID No. 43, M. grisea; SEQ ID No. 19, N. crassa;
SEQ ID No. 14, C. albicans; SEQ ID No. 12, C. albicans; SEQ ID No. 24, S. pombe;
SEQ ID Nos. 30, 33, F. sporotrichioides; SEQ ID No. 6, A. fumigatus;
SEQ ID No. 3, A. fumigatus; SEQ ID No. 10, A. nidulans; SEQ ID No. 8, A. fumigatus;
SEQ ID No. 16, N. crassa; SEQ ID No. 22, M. grisea; NP_295913; NP_625402; AF320254;
T44612; 6-2460, C. albicans; A36990, C. albicans; NCU04452.1, N. crassa;
4875, A. fumigatus; OYE3, S. cerevisiae; OYE2, S. cerevisiae; OYE1, S. cerevisiae.

Group labels: Fungal 2031 ORs; Bacterial; Fungal OYEs. Numbers at the nodes range from 55 to 100.

Figure 4. Phylogenetic tree showing relationships between A. fumigatus 2031 OR and
similar proteins. This demonstrates a 2031 OR clade, which can be distinguished from
the OYE proteins.



Plot (x-axis: Substrate).

Figure 5: NADPH dehydrogenase activity of recombinant 2031 OR with cyclohexenone
(CHX), N-ethylmaleimide (NEM), menadione (MEN) or duroquinone (DQ) as substrates.
Final concentrations in the assay were as follows: 500 µM substrate, 120 µM NADPH,
1 µg/200 µL 2031 OR.
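
The amounts implied by these final concentrations can be worked out with a short calculation. The Python sketch below is illustrative only: it assumes 120 µM NADPH and 1 µg of 2031 OR per 200 µL, with 200 µL taken as the reaction volume (an assumption; the caption gives the enzyme amount only as a ratio), and the variable names are hypothetical.

```python
# Assumed reading of the assay conditions (see the lead-in above).
REACTION_VOLUME_UL = 200.0   # assumption: 200 µL is the reaction volume
NADPH_CONC_UM = 120.0        # final NADPH concentration, µM
ENZYME_MASS_UG = 1.0         # 2031 OR per reaction, µg

# µM x µL gives pmol; divide by 1000 to express the amount in nmol.
nadph_nmol = NADPH_CONC_UM * REACTION_VOLUME_UL / 1000.0

# Enzyme concentration in µg/mL (1000 µL per mL).
enzyme_ug_per_ml = ENZYME_MASS_UG / (REACTION_VOLUME_UL / 1000.0)

print(f"NADPH per reaction: {nadph_nmol:g} nmol")
print(f"2031 OR concentration: {enzyme_ug_per_ml:g} µg/mL")
```

Under these assumptions each reaction contains 24 nmol of NADPH and the enzyme is present at 5 µg/mL.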
Figure 6: Inhibition of 2031 OR function by two inhibitors (shown in A and B) identified
by high-throughput screening. 



